Variant CBH I polypeptides with reduced product inhibition

ABSTRACT

The present disclosure relates to variant CBH I polypeptides that have reduced product inhibition, and compositions, e.g., cellulase compositions, comprising variant CBH I polypeptides. The variant CBH I polypeptides and related compositions can be used in a variety of agricultural and industrial applications. The present disclosure further relates to nucleic acids encoding variant CBH I polypeptides and host cells that recombinantly express the variant CBH I polypeptides.

CROSS REFERENCE TO RELATED APPLICATIONS

This application is a divisional application of U.S. application Ser. No. 13/824,317 filed Dec. 18, 2013, now issued as U.S. Pat. No. 9,096,871; which is a 35 USC §371 National Stage application of International Application No. PCT/US2011/055181 filed Oct. 6, 2011, now expired; which claims the benefit under 35 USC §119(e) to U.S. Application Ser. No. 61/390,392 filed Oct. 6, 2010, now expired. The disclosure of each of the prior applications is considered part of and is incorporated by reference in the disclosure of this application.

BACKGROUND OF THE INVENTION

Cellulose is an unbranched polymer of glucose linked by β(1→4)-glycosidic bonds. Cellulose chains can interact with each other via hydrogen bonding to form a crystalline solid of high mechanical strength and chemical stability. The cellulose chains must be depolymerized into glucose and short oligosaccharides before organisms, such as the fermenting microbes used in ethanol production, can use them as metabolic fuel. Cellulase enzymes catalyze the hydrolysis of the cellulose (hydrolysis of β-1,4-D-glucan linkages) in the biomass into products such as glucose, cellobiose, and other cellooligosaccharides. Cellulase is a generic term denoting a multienzyme mixture comprising exo-acting cellobiohydrolases (CBHs), endoglucanases (EGs) and β-glucosidases (BGs) that can be produced by a number of plants and microorganisms. Enzymes in the cellulase of Trichoderma reesei include CBH I (more generally, Cel7A), CBH2 (Cel6A), EG1 (Cel7B), EG2 (Cel5), EG3 (Cel12), EG4 (Cel61A), EG5 (Cel45A), EG6 (Cel74A), Cip1, Cip2, β-glucosidases (including, e.g., Cel3A), acetyl xylan esterase, β-mannanase, and swollenin.

Cellulase enzymes work synergistically to hydrolyze cellulose to glucose. CBH I and CBH II act on opposing ends of cellulose chains (Barr et al., 1996, Biochemistry 35:586-92), while the endoglucanases act at internal locations in the cellulose. The primary product of these enzymes is cellobiose, which is further hydrolyzed to glucose by one or more β-glucosidases.

The cellobiohydrolases are subject to inhibition by their direct product, cellobiose, which slows saccharification reactions as product accumulates. There is a need for new and improved cellobiohydrolases with improved productivity that maintain their reaction rates during the course of a saccharification reaction, for use in the conversion of cellulose into fermentable sugars and for related fields of cellulosic material processing such as pulp and paper, textiles and animal feeds.

SUMMARY OF THE INVENTION

The present disclosure relates to variant CBH I polypeptides. Most naturally occurring CBH I polypeptides have arginines at positions corresponding to R268 and R411 of T. reesei CBH I (SEQ ID NO:2). The variant CBH I polypeptides of the present disclosure include a substitution at either or both positions resulting in a reduction or decrease in product (e.g., cellobiose) inhibition. Such variants are sometimes referred to herein as “product tolerant.”

The variant CBH I polypeptides of the disclosure minimally contain at least a CBH I catalytic domain, comprising (a) a substitution at the amino acid position corresponding to R268 of T. reesei CBH I (“R268 substitution”); (b) a substitution at the amino acid position corresponding to R411 of T. reesei CBH I (“R411 substitution”); or (c) both an R268 substitution and an R411 substitution. The amino acid positions of exemplary CBH I polypeptides into which R268 and/or R411 substitutions can be introduced are shown in Table 1, and the amino acid positions corresponding to R268 and/or R411 in these exemplary CBH I polypeptides are shown in Table 2.

R268 and/or R411 substituents can include lysines and/or alanines. Accordingly, the present disclosure provides a variant CBH I polypeptide comprising a CBH I catalytic domain with one of the following amino acid substitutions or pairs of R268 and/or R411 substitutions: (a) R268K and R411K; (b) R268K and R411A; (c) R268A and R411K; (d) R268A and R411A; (e) R268A; (f) R268K; (g) R411A; and (h) R411K. In some embodiments, however, the amino acid sequence of the variant CBH I polypeptide does not comprise or consist of SEQ ID NO:299, SEQ ID NO:300, SEQ ID NO:301, or SEQ ID NO:302.

The variant CBH I polypeptides of the disclosure typically include a CD comprising an amino acid sequence having at least 50% sequence identity to a CD of a reference CBH I exemplified in Table 1. The CD portions of the CBH I polypeptides exemplified in Table 1 are delineated in Table 3. The variant CBH I polypeptides can have a cellulose binding domain (“CBD”) sequence in addition to the catalytic domain (“CD”) sequence. The CBD can be N- or C-terminal to the CD, and the CBD and CD are optionally connected via a linker sequence.

The variant CBH I polypeptides can be mature polypeptides or they may further comprise a signal sequence.

Additional embodiments of the variant CBH I polypeptides are provided in Section 0.

The variant CBH I polypeptides of the disclosure typically exhibit reduced product inhibition by cellobiose. In certain embodiments, the IC₅₀ of cellobiose towards a variant CBH I polypeptide of the disclosure is at least 1.2-fold, at least 1.5-fold, or at least 2-fold the IC₅₀ of cellobiose towards a reference CBH I lacking the R268 substitution and/or R411 substitution present in the variant. Additional embodiments of the product inhibition characteristics of the variant CBH I polypeptides are provided in Section 0.

The variant CBH I polypeptides of the disclosure typically retain some cellobiohydrolase activity. In certain embodiments, a variant CBH I polypeptide retains at least 50% the CBH I activity of a reference CBH I lacking the R268 substitution and/or R411 substitution present in the variant. Additional embodiments of cellobiohydrolase activity of the variant CBH I polypeptides are provided in Section 0.

The present disclosure further provides compositions (including cellulase compositions, e.g., whole cellulase compositions, and fermentation broths) comprising variant CBH I polypeptides. Additional embodiments of compositions comprising variant CBH I polypeptides are provided in Section 0. The variant CBH I polypeptides and compositions comprising them can be used, inter alia, in processes for saccharifying biomass. Additional details of saccharification reactions, and additional applications of the variant CBH I polypeptides, are provided in Section 0.

The present disclosure further provides nucleic acids (e.g., vectors) comprising nucleotide sequences encoding variant CBH I polypeptides as described herein, and recombinant cells engineered to express the variant CBH I polypeptides. The recombinant cell can be a prokaryotic (e.g., bacterial) or eukaryotic (e.g., yeast or filamentous fungal) cell. Further provided are methods of producing and optionally recovering the variant CBH I polypeptides. Additional embodiments of the recombinant expression system suitable for expression and production of the variant CBH I polypeptides are provided in Section 0.

BRIEF DESCRIPTION OF THE DRAWINGS AND TABLES

FIGS. 1A-1B: Cellobiose dose-response curves using a 4-MUL assay for a wild-type CBH I (BD29555; FIG. 1A) and a R268K/R411K variant CBH I (BD29555 with the substitutions R273K/R422K; FIG. 1B).

FIGS. 2A-2B: The effect of cellobiose accumulation on the activity of wild-type CBH I and a R268K/R411K variant CBH I, based on percent conversion of glucan after 72 hours in the bagasse assay. FIG. 2A shows relative activity in the presence (+) and absence (−) of β-glucosidase (BG), where relative activity is normalized to wild type activity with BG (WT+=1). FIG. 2B shows tolerance to cellobiose as a function of the ratio of activity in the absence vs. presence of β-glucosidase (activity ratio=Activity −BG/Activity +BG).

FIG. 3: Cellobiose dose-response curves using a PASC assay for a R268K/R411K variant CBH I polypeptide as compared to two wild type CBH I polypeptides.

FIG. 4: The effect of cellobiose accumulation on the activity of a wild-type CBH I and a R268K/R411K variant CBH I based on percent conversion of glucan after 72 hours in the bagasse assay in the presence (+) and absence (−) of β-glucosidase (BG). Activity is normalized to wild type activity with BG (WT+=1).

FIG. 5: Characterization of cellobiose product tolerance of variant CBH I polypeptides, based on percent conversion of glucan after 72 hours in the absence and presence of β-glucosidase (BG) in the bagasse assay; tolerance is evaluated as a function of the ratio of activity in the absence vs. presence of β-glucosidase.

TABLE 1: Amino acid sequences of exemplary “reference” CBH I polypeptides that can be modified at positions corresponding to R268 and/or R411 in T. reesei CBH I (SEQ ID NO:2). The database accession numbers are indicated in the second column. Unless indicated otherwise, the accession numbers refer to the Genbank database. “#” indicates that the CBH I has no signal peptide; “&” indicates that the sequence is from the PDB database and represents the catalytic domain only without signal sequence; “*” indicates a nonpublic database. These amino acid sequences are mostly wild type, with the exception of some sequences from the PDB database which contain mutations to facilitate protein crystallization.

TABLE 2: Amino acid positions in the exemplary reference CBH I polypeptides that correspond to R268 and R411 in T. reesei CBH I. Database descriptors are as for Table 1.

TABLE 3: Approximate amino acid positions of CBH I polypeptide domains. Abbreviations used: SS is signal sequence; CD is catalytic domain; and CBD is cellulose binding domain. Database descriptors are as for Table 1.

TABLE 4: A segment within the catalytic domain of each exemplary reference CBH I polypeptide containing the active site loop (shown in bold, underlined text) and the catalytic residues (glutamates in most CBH I polypeptides) (shown in bold, double underlined text). Database descriptors are as for Table 1.

TABLE 5: MUL and bagasse assay results for variants of BD29555. ND means not determined. ±% Activity (+/−cellobiose)=[(Activity with cellobiose)/(Activity without cellobiose)]*100. ¥ % Activity (−/+BG)=[(Activity without BG)/(Activity with BG)]*100.

TABLE 6: MUL and bagasse assay results for variants of T. reesei CBH I. ND means not determined. ±% Activity (+/−cellobiose)=[(Activity with cellobiose)/(Activity without cellobiose)]*100. ¥ % Activity (−/+BG)=[(Activity without BG)/(Activity with BG)]*100.

TABLE 7: Informal sequence listing. SEQ ID NO:1-149 correspond to the exemplary reference CBH I polypeptides. SEQ ID NO:299 corresponds to mature T. reesei CBH I (amino acids 26-529 of SEQ ID NO:2) with an R268A substitution. SEQ ID NO:300 corresponds to mature T. reesei CBH I (amino acids 26-529 of SEQ ID NO:2) with an R411A substitution. SEQ ID NO:301 corresponds to full length BD29555 with both an R268K substitution and an R411K substitution. SEQ ID NO:302 corresponds to mature BD29555 with both an R268K substitution and an R411K substitution.

DETAILED DESCRIPTION OF THE INVENTION

The present disclosure relates to variant CBH I polypeptides. Most naturally occurring CBH I polypeptides have arginines at positions corresponding to R268 and R411 of T. reesei CBH I (SEQ ID NO:2). The variant CBH I polypeptides of the present disclosure include a substitution at either or both positions resulting in a reduction of product (e.g., cellobiose) inhibition. The following subsections describe in greater detail the variant CBH I polypeptides and exemplary methods of their production, exemplary cellulase compositions comprising them, and some industrial applications of the polypeptides and cellulase compositions.

Variant CBH I Polypeptides

The present disclosure provides variant CBH I polypeptides comprising at least one amino acid substitution that results in reduced product inhibition. “Variant” means a polypeptide which differs in sequence from a reference polypeptide by substitution of one or more amino acids at one or a number of different sites in the amino acid sequence. Exemplary reference CBH I polypeptides are shown in Table 1.

The variant CBH I polypeptides of the disclosure have (a) an amino acid substitution at the amino acid position corresponding to R268 of T. reesei CBH I (SEQ ID NO:2) (an “R268 substitution”); (b) a substitution at the amino acid position corresponding to R411 of T. reesei CBH I (an “R411 substitution”); or (c) both an R268 substitution and an R411 substitution, as compared to a reference CBH I polypeptide. It is noted that the R268 and R411 numbering is made by reference to the full length T. reesei CBH I, which includes a signal sequence that is generally absent from the mature enzyme. The corresponding numbering in the mature T. reesei CBH I (see, e.g., SEQ ID NO:4) is R251 and R394, respectively.

Accordingly, the present disclosure provides variant CBH I polypeptides in which at least one of the amino acid positions corresponding to R268 and R411 of T. reesei CBH I, and optionally both the amino acid positions corresponding to R268 and R411 of T. reesei CBH I, is not an arginine.
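The relationship between full-length and mature numbering can be illustrated with a short sketch. It assumes that the signal sequence of full-length T. reesei CBH I occupies positions 1 to 17, which is inferred here from the catalytic domain being listed as starting at position 18 of SEQ ID NO:2; the function names are illustrative only.

```python
# Assumption: the signal sequence spans positions 1-17 of the full-length
# polypeptide (the catalytic domain is listed as starting at position 18).
SIGNAL_LENGTH = 17


def full_to_mature(position: int, signal_length: int = SIGNAL_LENGTH) -> int:
    """Convert a 1-based full-length position to mature-enzyme numbering."""
    if position <= signal_length:
        raise ValueError("position lies within the signal sequence")
    return position - signal_length


def mature_to_full(position: int, signal_length: int = SIGNAL_LENGTH) -> int:
    """Convert a 1-based mature-enzyme position to full-length numbering."""
    if position < 1:
        raise ValueError("positions are 1-based")
    return position + signal_length


for pos in (268, 411):
    print(f"R{pos} (full length) -> R{full_to_mature(pos)} (mature)")
```

Under this assumption the sketch reproduces the R251 and R394 mature positions given in this section.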

The amino acid positions in the reference polypeptides of Table 1 that correspond to R268 and R411 in T. reesei CBH I are shown in Table 2. Amino acid positions in other CBH I polypeptides that correspond to R268 and R411 can be identified through alignment of their sequences with T. reesei CBH I using a sequence comparison algorithm. Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology algorithm of Smith & Waterman, 1981, Adv. Appl. Math. 2:482-89; by the homology alignment algorithm of Needleman & Wunsch, 1970, J. Mol. Biol. 48:443-53; by the search for similarity method of Pearson & Lipman, 1988, Proc. Nat'l Acad. Sci. USA 85:2444-48, by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, Wis.), or by visual inspection.
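As a rough illustration of how a corresponding position might be located by alignment, the following sketch implements a minimal Needleman-Wunsch global alignment and maps a 1-based position in one sequence onto the other. The scoring values (match, mismatch, linear gap penalty) and the toy sequences are assumptions chosen for readability; a practical analysis would more likely use an established alignment program with a protein substitution matrix.

```python
def _pair_score(a, b, match, mismatch):
    return match if a == b else mismatch


def global_align(ref, query, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment with a linear gap penalty.

    Returns the two aligned strings, with '-' marking gaps.
    """
    n, m = len(ref), len(query)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + _pair_score(ref[i - 1], query[j - 1], match, mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    aligned_ref, aligned_query = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and score[i][j] ==
                score[i - 1][j - 1] + _pair_score(ref[i - 1], query[j - 1], match, mismatch)):
            aligned_ref.append(ref[i - 1])
            aligned_query.append(query[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            aligned_ref.append(ref[i - 1])
            aligned_query.append("-")
            i -= 1
        else:
            aligned_ref.append("-")
            aligned_query.append(query[j - 1])
            j -= 1
    return "".join(reversed(aligned_ref)), "".join(reversed(aligned_query))


def corresponding_position(ref, query, ref_pos):
    """Return the 1-based query position aligned to ref_pos, or None for a gap."""
    aligned_ref, aligned_query = global_align(ref, query)
    r = q = 0
    for cr, cq in zip(aligned_ref, aligned_query):
        if cr != "-":
            r += 1
        if cq != "-":
            q += 1
        if cr != "-" and r == ref_pos:
            return q if cq != "-" else None
    return None


# Toy sequences (hypothetical): the query carries two extra N-terminal residues.
toy_ref = "ACDRFGH"
toy_query = "MKACDRFGH"
print(corresponding_position(toy_ref, toy_query, 4))  # prints 6
```

In this toy case the arginine at position 4 of the reference maps to position 6 of the query, in the same way that Table 2 lists a different residue number for each reference polypeptide.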

The R268 and/or R411 substitutions are preferably selected from (a) R268K and R411K; (b) R268K and R411A; (c) R268A and R411K; (d) R268A and R411A; (e) R268A; (f) R268K; (g) R411A; and (h) R411K.
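The substitution notation used above (original residue, 1-based position, replacement residue) can be applied to a sequence programmatically. The sketch below is a minimal illustration; the function name and the placeholder sequence are hypothetical, and the positions refer to T. reesei CBH I numbering, so the residue numbers would differ in other reference polypeptides (for example, R273K/R422K in BD29555).

```python
import re
from typing import Iterable

# Pattern for labels such as "R268K": original residue, position, new residue.
SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitutions(sequence: str, substitutions: Iterable[str]) -> str:
    """Return a copy of `sequence` with each labelled substitution applied."""
    residues = list(sequence)
    for label in substitutions:
        parsed = SUBSTITUTION.match(label)
        if parsed is None:
            raise ValueError(f"unrecognized substitution label: {label!r}")
        original, position, replacement = parsed.group(1), int(parsed.group(2)), parsed.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"{label}: position outside the sequence")
        if residues[position - 1] != original:
            raise ValueError(f"{label}: expected {original} at position {position}, "
                             f"found {residues[position - 1]}")
        residues[position - 1] = replacement
    return "".join(residues)


# Placeholder sequence (not a real CBH I): glycines with arginines at 268 and 411.
placeholder = ["G"] * 420
placeholder[267] = "R"
placeholder[410] = "R"
variant = apply_substitutions("".join(placeholder), ["R268K", "R411A"])
print(variant[267], variant[410])  # prints: K A
```

Checking the original residue before replacing it guards against applying a label to a sequence that uses a different numbering.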

CBH I polypeptides belong to the glycosyl hydrolase family 7 (“GH7”). The glycosyl hydrolases of this family include endoglucanases and cellobiohydrolases (exoglucanases). The cellobiohydrolases act processively from the reducing ends of cellulose chains to generate cellobiose. Cellulases of bacterial and fungal origin characteristically have a small cellulose-binding domain (“CBD”) connected to either the N or the C terminus of the catalytic domain (“CD”) via a linker peptide (see Suurnäkki et al., 2000, Cellulose 7: 189-209). The CD contains the active site whereas the CBD interacts with cellulose by binding the enzyme to it (van Tilbeurgh et al., 1986, FEBS Lett. 204(2): 223-227; Tomme et al., 1988, Eur. J. Biochem. 170:575-581). The three-dimensional structure of the catalytic domain of T. reesei CBH I has been solved (Divne et al., 1994, Science 265:524-528). The CD consists of two β-sheets that pack face-to-face to form a β-sandwich. Most of the remaining amino acids in the CD are loops connecting the β-sheets. Some loops are elongated and bend around the active site, forming a cellulose-binding tunnel (~50 Å). In contrast, endoglucanases have an open substrate binding cleft/groove rather than a tunnel. Typically, the catalytic residues are glutamic acids corresponding to E229 and E234 of T. reesei CBH I.

The loops characteristic of the active sites (“the active site loops”) of reference CBH I polypeptides, which are absent from GH7 family endoglucanases, as well as catalytic glutamate residues of the reference CBH I polypeptides, are shown in Table 4. The variant CBH I polypeptides of the disclosure preferably retain the catalytic glutamate residues or may include a glutamine instead at the position corresponding to E234, as for SEQ ID NO:4. In some embodiments, the variant CBH I polypeptides contain no substitutions or only conservative substitutions in the active site loops relative to the reference CBH I polypeptides from which the variants are derived.

Many CBH I polypeptides do not have a CBD, and most studies concerning the activity of cellulase domains on different substrates have been carried out with only the catalytic domains of CBH I polypeptides. Because CDs with cellobiohydrolase activity can be generated by limited proteolysis of mature CBH I by papain (see, e.g., Chen et al., 1993, Biochem. Mol. Biol. Int. 30(5):901-10), they are often referred to as “core” domains. Accordingly, a variant CBH I can include only the CD “core” of CBH I. Exemplary reference CDs comprise amino acid sequences corresponding to positions 26 to 455 of SEQ ID NO:1, positions 18 to 444 of SEQ ID NO:2, positions 26 to 455 of SEQ ID NO:3, positions 1 to 427 of SEQ ID NO:4, positions 24 to 457 of SEQ ID NO:5, positions 18 to 448 of SEQ ID NO:6, positions 27 to 460 of SEQ ID NO:7, positions 27 to 460 of SEQ ID NO:8, positions 20 to 449 of SEQ ID NO:9, positions 1 to 424 of SEQ ID NO:10, positions 18 to 447 of SEQ ID NO:11, positions 18 to 434 of SEQ ID NO:12, positions 18 to 445 of SEQ ID NO:13, positions 19 to 454 of SEQ ID NO:14, positions 19 to 443 of SEQ ID NO:15, positions 2 to 426 of SEQ ID NO:16, positions 23 to 446 of SEQ ID NO:17, positions 19 to 449 of SEQ ID NO:18, positions 23 to 446 of SEQ ID NO:19, positions 19 to 449 of SEQ ID NO:20, positions 2 to 416 of SEQ ID NO:21, positions 19 to 454 of SEQ ID NO:22, positions 19 to 447 of SEQ ID NO:23, positions 19 to 447 of SEQ ID NO:24, positions 20 to 443 of SEQ ID NO:25, positions 18 to 447 of SEQ ID NO:26, positions 19 to 442 of SEQ ID NO:27, positions 18 to 451 of SEQ ID NO:28, positions 23 to 446 of SEQ ID NO:29, positions 18 to 444 of SEQ ID NO:30, positions 18 to 451 of SEQ ID NO:31, positions 18 to 447 of SEQ ID NO:32, positions 19 to 449 of SEQ ID NO:33, positions 18 to 447 of SEQ ID NO:34, positions 26 to 459 of SEQ ID NO:35, positions 19 to 450 of SEQ ID NO:36, positions 19 to 453 of SEQ ID NO:37, positions 18 to 448 of SEQ ID NO:38, positions 19 to 443 of SEQ ID 
NO:39, positions 19 to 442 of SEQ ID NO:40, positions 18 to 444 of SEQ ID NO:41, positions 24 to 457 of SEQ ID NO:42, positions 18 to 449 of SEQ ID NO:43, positions 19 to 453 of SEQ ID NO:44, positions 26 to 456 of SEQ ID NO:45, positions 19 to 451 of SEQ ID NO:46, positions 18 to 443 of SEQ ID NO:47, positions 18 to 448 of SEQ ID NO:48, positions 19 to 451 of SEQ ID NO:49, positions 18 to 444 of SEQ ID NO:50, positions 2 to 419 of SEQ ID NO:51, positions 27 to 461 of SEQ ID NO:52, positions 21 to 445 of SEQ ID NO:53, positions 19 to 449 of SEQ ID NO:54, positions 19 to 448 of SEQ ID NO:55, positions 18 to 443 of SEQ ID NO:56, positions 20 to 443 of SEQ ID NO:57, positions 18 to 448 of SEQ ID NO:58, positions 18 to 447 of SEQ ID NO:59, positions 26 to 455 of SEQ ID NO:60, positions 19 to 449 of SEQ ID NO:61, positions 19 to 449 of SEQ ID NO:62, positions 26 to 460 of SEQ ID NO:63, positions 18 to 448 of SEQ ID NO:64, positions 19 to 451 of SEQ ID NO:65, positions 19 to 447 of SEQ ID NO:66, positions 1 to 424 of SEQ ID NO:67, positions 19 to 448 of SEQ ID NO:68, positions 19 to 443 of SEQ ID NO:69, positions 23 to 447 of SEQ ID NO:70, positions 17 to 448 of SEQ ID NO:71, positions 19 to 449 of SEQ ID NO:72, positions 18 to 444 of SEQ ID NO:73, positions 23 to 458 of SEQ ID NO:74, positions 20 to 452 of SEQ ID NO:75, positions 18 to 435 of SEQ ID NO:76, positions 18 to 446 of SEQ ID NO:77, positions 22 to 457 of SEQ ID NO:78, positions 18 to 448 of SEQ ID NO:79, positions 1 to 431 of SEQ ID NO:80, positions 19 to 453 of SEQ ID NO:81, positions 21 to 440 of SEQ ID NO:82, positions 19 to 442 of SEQ ID NO:83, positions 18 to 448 of SEQ ID NO:84, positions 17 to 446 of SEQ ID NO:85, positions 18 to 447 of SEQ ID NO:86, positions 18 to 443 of SEQ ID NO:87, positions 23 to 448 of SEQ ID NO:88, positions 18 to 451 of SEQ ID NO:89, positions 21 to 447 of SEQ ID NO:90, positions 18 to 444 of SEQ ID NO:91, positions 19 to 442 of SEQ ID NO:92, positions 20 to 436 of SEQ ID 
NO:93, positions 18 to 450 of SEQ ID NO:94, positions 22 to 453 of SEQ ID NO:95, positions 16 to 472 of SEQ ID NO:96, positions 21 to 445 of SEQ ID NO:97, positions 19 to 447 of SEQ ID NO:98, positions 19 to 450 of SEQ ID NO:99, positions 19 to 451 of SEQ ID NO:100, positions 18 to 448 of SEQ ID NO:101, positions 19 to 442 of SEQ ID NO:102, positions 20 to 457 of SEQ ID NO:103, positions 19 to 454 of SEQ ID NO:104, positions 18 to 440 of SEQ ID NO:105, positions 18 to 439 of SEQ ID NO:106, positions 27 to 460 of SEQ ID NO:107, positions 23 to 446 of SEQ ID NO:108, positions 17 to 446 of SEQ ID NO:109, positions 21 to 447 of SEQ ID NO:110, positions 19 to 447 of SEQ ID NO:111, positions 18 to 449 of SEQ ID NO:112, positions 22 to 457 of SEQ ID NO:113, positions 18 to 445 of SEQ ID NO:114, positions 18 to 448 of SEQ ID NO:115, positions 18 to 448 of SEQ ID NO:116, positions 23 to 435 of SEQ ID NO:117, positions 21 to 442 of SEQ ID NO:118, positions 23 to 435 of SEQ ID NO:119, positions 20 to 445 of SEQ ID NO:120, positions 21 to 443 of SEQ ID NO:121, positions 20 to 445 of SEQ ID NO:122, positions 23 to 443 of SEQ ID NO:123, positions 20 to 445 of SEQ ID NO:124, positions 21 to 435 of SEQ ID NO:125, positions 20 to 437 of SEQ ID NO:126, positions 21 to 442 of SEQ ID NO:127, positions 23 to 434 of SEQ ID NO:128, positions 20 to 444 of SEQ ID NO:129, positions 21 to 435 of SEQ ID NO:130, positions 20 to 445 of SEQ ID NO:131, positions 21 to 446 of SEQ ID NO:132, positions 21 to 435 of SEQ ID NO:133, positions 22 to 448 of SEQ ID NO:134, positions 23 to 433 of SEQ ID NO:135, positions 23 to 434 of SEQ ID NO:136, positions 23 to 435 of SEQ ID NO:137, positions 23 to 435 of SEQ ID NO:138, positions 20 to 445 of SEQ ID NO:139, positions 20 to 437 of SEQ ID NO:140, positions 21 to 435 of SEQ ID NO:141, positions 20 to 437 of SEQ ID NO:142, positions 21 to 435 of SEQ ID NO:143, positions 26 to 435 of SEQ ID NO:144, positions 23 to 435 of SEQ ID NO:145, positions 24 to 443 of 
SEQ ID NO:146, positions 20 to 445 of SEQ ID NO:147, positions 21 to 441 of SEQ ID NO:148, and positions 20 to 437 of SEQ ID NO:149.

The CBDs are particularly involved in the hydrolysis of crystalline cellulose. It has been shown that the ability of cellobiohydrolases to degrade crystalline cellulose decreases when the CBD is absent (Linder and Teeri, 1997, Journal of Biotechnol. 57:15-28). The variant CBH I polypeptides of the disclosure can further include a CBD. Exemplary CBDs comprise amino acid sequences corresponding to positions 494 to 529 of SEQ ID NO:1, positions 480 to 514 of SEQ ID NO:2, positions 494 to 529 of SEQ ID NO:3, positions 491 to 526 of SEQ ID NO:5, positions 477 to 512 of SEQ ID NO:6, positions 497 to 532 of SEQ ID NO:7, positions 504 to 539 of SEQ ID NO:8, positions 486 to 521 of SEQ ID NO:13, positions 556 to 596 of SEQ ID NO:15, positions 490 to 525 of SEQ ID NO:18, positions 495 to 530 of SEQ ID NO:20, positions 471 to 506 of SEQ ID NO:23, positions 481 to 516 of SEQ ID NO:27, positions 480 to 514 of SEQ ID NO:30, positions 495 to 529 of SEQ ID NO:35, positions 493 to 528 of SEQ ID NO:36, positions 477 to 512 of SEQ ID NO:38, positions 547 to 586 of SEQ ID NO:39, positions 475 to 510 of SEQ ID NO:40, positions 479 to 513 of SEQ ID NO:41, positions 506 to 541 of SEQ ID NO:42, positions 481 to 516 of SEQ ID NO:43, positions 503 to 537 of SEQ ID NO:45, positions 488 to 523 of SEQ ID NO:46, positions 476 to 511 of SEQ ID NO:48, positions 488 to 523 of SEQ ID NO:49, positions 479 to 513 of SEQ ID NO:50, positions 500 to 535 of SEQ ID NO:52, positions 493 to 528 of SEQ ID NO:55, positions 479 to 514 of SEQ ID NO:58, positions 494 to 529 of SEQ ID NO:60, positions 490 to 525 of SEQ ID NO:61, positions 497 to 532 of SEQ ID NO:62, positions 475 to 510 of SEQ ID NO:64, positions 477 to 512 of SEQ ID NO:65, positions 486 to 521 of SEQ ID NO:66, positions 470 to 505 of SEQ ID NO:67, positions 491 to 526 of SEQ ID NO:68, positions 476 to 511 of SEQ ID NO:69, positions 480 to 514 of SEQ ID NO:73, positions 506 to 540 of SEQ ID NO:74, positions 471 to 504 of SEQ ID NO:76, positions 
501 to 536 of SEQ ID NO:78, positions 473 to 508 of SEQ ID NO:79, positions 481 to 516 of SEQ ID NO:83, positions 488 to 523 of SEQ ID NO:86, positions 475 to 510 of SEQ ID NO:92, positions 468 to 504 of SEQ ID NO:93, positions 501 to 536 of SEQ ID NO:96, positions 482 to 517 of SEQ ID NO:98, positions 481 to 516 of SEQ ID NO:99, positions 488 to 523 of SEQ ID NO:100, positions 472 to 507 of SEQ ID NO:101, positions 481 to 516 of SEQ ID NO:102, positions 471 to 505 of SEQ ID NO:105, positions 481 to 516 of SEQ ID NO:106, positions 495 to 530 of SEQ ID NO:107, positions 488 to 523 of SEQ ID NO:111, positions 478 to 513 of SEQ ID NO:112, positions 501 to 536 of SEQ ID NO:113, positions 491 to 526 of SEQ ID NO:115, and positions 503 to 538 of SEQ ID NO:116.

The CD and CBD are often connected via a linker. Exemplary linker sequences correspond to positions 456 to 493 of SEQ ID NO:1, positions 445 to 479 of SEQ ID NO:2, positions 456 to 493 of SEQ ID NO:3, positions 458 to 490 of SEQ ID NO:5, positions 449 to 476 of SEQ ID NO:6, positions 461 to 496 of SEQ ID NO:7, positions 461 to 503 of SEQ ID NO:8, positions 446 to 485 of SEQ ID NO:13, positions 444 to 555 of SEQ ID NO:15, positions 450 to 489 of SEQ ID NO:18, positions 450 to 494 of SEQ ID NO:20, positions 448 to 470 of SEQ ID NO:23, positions 443 to 480 of SEQ ID NO:27, positions 445 to 479 of SEQ ID NO:30, positions 460 to 494 of SEQ ID NO:35, positions 451 to 492 of SEQ ID NO:36, positions 449 to 476 of SEQ ID NO:38, positions 444 to 546 of SEQ ID NO:39, positions 443 to 474 of SEQ ID NO:40, positions 445 to 478 of SEQ ID NO:41, positions 458 to 505 of SEQ ID NO:42, positions 450 to 480 of SEQ ID NO:43, positions 457 to 502 of SEQ ID NO:45, positions 452 to 487 of SEQ ID NO:46, positions 449 to 475 of SEQ ID NO:48, positions 452 to 487 of SEQ ID NO:49, positions 445 to 478 of SEQ ID NO:50, positions 462 to 499 of SEQ ID NO:52, positions 449 to 492 of SEQ ID NO:55, positions 449 to 478 of SEQ ID NO:58, positions 456 to 493 of SEQ ID NO:60, positions 450 to 489 of SEQ ID NO:61, positions 450 to 496 of SEQ ID NO:62, positions 449 to 474 of SEQ ID NO:64, positions 452 to 476 of SEQ ID NO:65, positions 448 to 485 of SEQ ID NO:66, positions 425 to 469 of SEQ ID NO:67, positions 449 to 490 of SEQ ID NO:68, positions 444 to 475 of SEQ ID NO:69, positions 445 to 479 of SEQ ID NO:73, positions 459 to 505 of SEQ ID NO:74, positions 436 to 470 of SEQ ID NO:76, positions 458 to 500 of SEQ ID NO:78, positions 449 to 472 of SEQ ID NO:79, positions 443 to 480 of SEQ ID NO:83, positions 448 to 487 of SEQ ID NO:86, positions 443 to 474 of SEQ ID NO:92, positions 437 to 467 of SEQ ID NO:93, positions 473 to 500 of SEQ ID NO:96, positions 448 to 481 of SEQ ID NO:98, positions 451 to 
480 of SEQ ID NO:99, positions 452 to 487 of SEQ ID NO:100, positions 449 to 471 of SEQ ID NO:101, positions 443 to 480 of SEQ ID NO:102, positions 441 to 470 of SEQ ID NO:105, positions 440 to 480 of SEQ ID NO:106, positions 461 to 494 of SEQ ID NO:107, positions 448 to 487 of SEQ ID NO:111, positions 450 to 478 of SEQ ID NO:112, positions 458 to 500 of SEQ ID NO:113, positions 449 to 490 of SEQ ID NO:115, and positions 449 to 502 of SEQ ID NO:116.
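The domain boundaries listed above are 1-based and inclusive, which differs from the 0-based, half-open slicing used in many programming languages. The following sketch shows one way to extract domains under that convention, using the boundaries given for SEQ ID NO:2 (CD 18 to 444, linker 445 to 479, CBD 480 to 514). The sequence itself is a placeholder string, not the actual SEQ ID NO:2, and the helper name is hypothetical.

```python
# Boundaries for SEQ ID NO:2 as listed in this section (1-based, inclusive).
DOMAINS_SEQ_ID_2 = {
    "CD": (18, 444),
    "linker": (445, 479),
    "CBD": (480, 514),
}


def extract_domain(sequence: str, start: int, end: int) -> str:
    """Return residues start..end (1-based, inclusive) of `sequence`."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("domain boundaries fall outside the sequence")
    return sequence[start - 1:end]


# Placeholder of the right length; substitute the real sequence in practice.
placeholder_sequence = "X" * 514

domain_lengths = {
    name: len(extract_domain(placeholder_sequence, start, end))
    for name, (start, end) in DOMAINS_SEQ_ID_2.items()
}
print(domain_lengths)

# The three domains are contiguous: each starts right after the previous ends.
ordered = sorted(DOMAINS_SEQ_ID_2.values())
contiguous = all(nxt[0] == prev[1] + 1 for prev, nxt in zip(ordered, ordered[1:]))
print(contiguous)
```

The same helper could be reused with any of the other listed boundaries, since each reference polypeptide is described with the same position convention.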

Because CBH I polypeptides are modular, the CBDs, CDs and linkers of different CBH I polypeptides, such as the exemplary CBH I polypeptides of Table 1, can be used interchangeably. However, in a preferred embodiment, the CBDs, CDs and linkers of a variant CBH I of the disclosure originate from the same polypeptide.

The variant CBH I polypeptides of the disclosure preferably have at least a two-fold reduction of product inhibition, such that cellobiose has an IC₅₀ towards the variant CBH I that is at least 2-fold the IC₅₀ of the corresponding reference CBH I, e.g., CBH I lacking the R268 substitution and/or R411 substitution. More preferably the IC₅₀ of cellobiose towards the variant CBH I is at least 3-fold, at least 5-fold, at least 8-fold, at least 10-fold, at least 12-fold or at least 15-fold the IC₅₀ of the corresponding reference CBH I. In specific embodiments the IC₅₀ of cellobiose towards the variant CBH I ranges from 2-fold to 15-fold, from 2-fold to 10-fold, from 3-fold to 10-fold, from 5-fold to 12-fold, from 4-fold to 12-fold, from 5-fold to 10-fold, from 2-fold to 8-fold, or from 8-fold to 20-fold the IC₅₀ of the corresponding reference CBH I. The IC₅₀ can be determined in a phosphoric acid swollen cellulose (“PASC”) assay (Du et al., 2010, Applied Biochemistry and Biotechnology 161:313-317) or a methylumbelliferyl lactoside (“MUL”) assay (van Tilbeurgh and Claeyssens, 1985, FEBS Letts. 187(2):283-288), as exemplified in the Examples below.
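The fold-change comparison described above can be sketched numerically. In the example below the IC₅₀ is estimated by simple linear interpolation between the two dose-response points that bracket 50% activity; this is a simplification adopted for illustration, not the curve-fitting procedure of the cited assays, and all data values are invented.

```python
def estimate_ic50(concentrations, percent_activity):
    """Estimate the concentration giving 50% activity by linear interpolation.

    `concentrations` must be increasing and `percent_activity` is expressed
    relative to the uninhibited reaction (100 = no inhibition).
    """
    points = list(zip(concentrations, percent_activity))
    for (c0, a0), (c1, a1) in zip(points, points[1:]):
        if a0 >= 50 >= a1 and a0 != a1:
            return c0 + (a0 - 50) * (c1 - c0) / (a0 - a1)
    raise ValueError("activity does not cross 50% within the tested range")


def ic50_fold_change(ic50_variant, ic50_reference):
    """Ratio of the variant IC50 to the reference IC50."""
    return ic50_variant / ic50_reference


# Hypothetical dose-response data (arbitrary cellobiose concentration units).
cellobiose = [0, 1, 2, 4, 8]
reference_activity = [100, 60, 40, 20, 10]
variant_activity = [100, 90, 80, 60, 40]

ic50_reference = estimate_ic50(cellobiose, reference_activity)
ic50_variant = estimate_ic50(cellobiose, variant_activity)
fold = ic50_fold_change(ic50_variant, ic50_reference)
print(ic50_reference, ic50_variant, fold)
print("meets the 2-fold threshold:", fold >= 2)
```

With these invented values the variant would satisfy the "at least 2-fold" criterion; real data would normally be fitted to a dose-response model before comparing IC₅₀ values.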

The variant CBH I polypeptides of the disclosure preferably have a cellobiohydrolase activity that is at least 30% of the cellobiohydrolase activity of the corresponding reference CBH I, e.g., CBH I lacking the R268 substitution and/or R411 substitution. More preferably, the cellobiohydrolase activity of the variant CBH I is at least 40%, at least 50%, at least 60% or at least 70% of the cellobiohydrolase activity of the corresponding reference CBH I. In specific embodiments the cellobiohydrolase activity of the variant CBH I ranges from 30% to 80%, from 40% to 70%, from 30% to 60%, from 50% to 80% or from 60% to 80% of the cellobiohydrolase activity of the corresponding reference CBH I. Assays for cellobiohydrolase activity are described, for example, in Becker et al., 2001, Biochem J. 356:19-30 and Mitsuishi et al., 1990, FEBS Letts. 275:135-138, each of which is expressly incorporated by reference herein. The ability of CBH I to hydrolyze isolated soluble and insoluble substrates can also be measured using assays described in Srisodsuk et al., 1997, J. Biotech. 57:49-57 and Nidetzky and Claeyssens, 1994, Biotech. Bioeng. 44:961-966. Substrates useful for assaying cellobiohydrolase activity include crystalline cellulose, filter paper, phosphoric acid swollen cellulose, cellooligosaccharides, methylumbelliferyl lactoside, methylumbelliferyl cellobioside, orthonitrophenyl lactoside, paranitrophenyl lactoside, orthonitrophenyl cellobioside, and paranitrophenyl cellobioside. Cellobiohydrolase activity can be measured in an assay utilizing PASC as the substrate and a calcofluor white detection method (Du et al., 2010, Applied Biochemistry and Biotechnology 161:313-317). PASC can be prepared as described by Walseth, 1952, TAPPI 35:228-235 and Wood, 1971, Biochem. J. 121:353-362.

Other than said R268 and/or R411 substitution, the variant CBH I polypeptides of the disclosure preferably:

-   -   comprise an amino acid sequence having at least 50%, 51%, 52%,         53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%,         66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%,         79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%,         92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more, or complete         (100%) sequence identity to a CD of a reference CBH I         exemplified in Table 1 (i.e., a CD comprising an amino acid         sequence corresponding to positions 26 to 455 of SEQ ID NO:1,         positions 18 to 444 of SEQ ID NO:2, positions 26 to 455 of SEQ         ID NO:3, positions 1 to 427 of SEQ ID NO:4, positions 24 to 457         of SEQ ID NO:5, positions 18 to 448 of SEQ ID NO:6, positions 27         to 460 of SEQ ID NO:7, positions 27 to 460 of SEQ ID NO:8,         positions 20 to 449 of SEQ ID NO:9, positions 1 to 424 of SEQ ID         NO:10, positions 18 to 447 of SEQ ID NO:11, positions 18 to 434         of SEQ ID NO:12, positions 18 to 445 of SEQ ID NO:13, positions         19 to 454 of SEQ ID NO:14, positions 19 to 443 of SEQ ID NO:15,         positions 2 to 426 of SEQ ID NO:16, positions 23 to 446 of SEQ         ID NO:17, positions 19 to 449 of SEQ ID NO:18, positions 23 to         446 of SEQ ID NO:19, positions 19 to 449 of SEQ ID NO:20,         positions 2 to 416 of SEQ ID NO:21, positions 19 to 454 of SEQ         ID NO:22, positions 19 to 447 of SEQ ID NO:23, positions 19 to         447 of SEQ ID NO:24, positions 20 to 443 of SEQ ID NO:25,         positions 18 to 447 of SEQ ID NO:26, positions 19 to 442 of SEQ         ID NO:27, positions 18 to 451 of SEQ ID NO:28, positions 23 to         446 of SEQ ID NO:29, positions 18 to 444 of SEQ ID NO:30,         positions 18 to 451 of SEQ ID NO:31, positions 18 to 447 of SEQ         ID NO:32, positions 19 to 449 of SEQ ID NO:33, positions 18 to         447 of SEQ ID NO:34, positions 26 to 459 of SEQ ID NO:35,         positions 19 to 450 of SEQ 
ID NO:36, positions 19 to 453 of SEQ         ID NO:37, positions 18 to 448 of SEQ ID NO:38, positions 19 to         443 of SEQ ID NO:39, positions 19 to 442 of SEQ ID NO:40,         positions 18 to 444 of SEQ ID NO:41, positions 24 to 457 of SEQ         ID NO:42, positions 18 to 449 of SEQ ID NO:43, positions 19 to         453 of SEQ ID NO:44, positions 26 to 456 of SEQ ID NO:45,         positions 19 to 451 of SEQ ID NO:46, positions 18 to 443 of SEQ         ID NO:47, positions 18 to 448 of SEQ ID NO:48, positions 19 to         451 of SEQ ID NO:49, positions 18 to 444 of SEQ ID NO:50,         positions 2 to 419 of SEQ ID NO:51, positions 27 to 461 of SEQ         ID NO:52, positions 21 to 445 of SEQ ID NO:53, positions 19 to         449 of SEQ ID NO:54, positions 19 to 448 of SEQ ID NO:55,         positions 18 to 443 of SEQ ID NO:56, positions 20 to 443 of SEQ         ID NO:57, positions 18 to 448 of SEQ ID NO:58, positions 18 to         447 of SEQ ID NO:59, positions 26 to 455 of SEQ ID NO:60,         positions 19 to 449 of SEQ ID NO:61, positions 19 to 449 of SEQ         ID NO:62, positions 26 to 460 of SEQ ID NO:63, positions 18 to         448 of SEQ ID NO:64, positions 19 to 451 of SEQ ID NO:65,         positions 19 to 447 of SEQ ID NO:66, positions 1 to 424 of SEQ         ID NO:67, positions 19 to 448 of SEQ ID NO:68, positions 19 to         443 of SEQ ID NO:69, positions 23 to 447 of SEQ ID NO:70,         positions 17 to 448 of SEQ ID NO:71, positions 19 to 449 of SEQ         ID NO:72, positions 18 to 444 of SEQ ID NO:73, positions 23 to         458 of SEQ ID NO:74, positions 20 to 452 of SEQ ID NO:75,         positions 18 to 435 of SEQ ID NO:76, positions 18 to 446 of SEQ         ID NO:77, positions 22 to 457 of SEQ ID NO:78, positions 18 to         448 of SEQ ID NO:79, positions 1 to 431 of SEQ ID NO:80,         positions 19 to 453 of SEQ ID NO:81, positions 21 to 440 of SEQ         ID NO:82, positions 19 to 442 of SEQ ID NO:83, positions 18 to         448 
of SEQ ID NO:84, positions 17 to 446 of SEQ ID NO:85,         positions 18 to 447 of SEQ ID NO:86, positions 18 to 443 of SEQ         ID NO:87, positions 23 to 448 of SEQ ID NO:88, positions 18 to         451 of SEQ ID NO:89, positions 21 to 447 of SEQ ID NO:90,         positions 18 to 444 of SEQ ID NO:91, positions 19 to 442 of SEQ         ID NO:92, positions 20 to 436 of SEQ ID NO:93, positions 18 to         450 of SEQ ID NO:94, positions 22 to 453 of SEQ ID NO:95,         positions 16 to 472 of SEQ ID NO:96, positions 21 to 445 of SEQ         ID NO:97, positions 19 to 447 of SEQ ID NO:98, positions 19 to         450 of SEQ ID NO:99, positions 19 to 451 of SEQ ID NO:100,         positions 18 to 448 of SEQ ID NO:101, positions 19 to 442 of SEQ         ID NO:102, positions 20 to 457 of SEQ ID NO:103, positions 19 to         454 of SEQ ID NO:104, positions 18 to 440 of SEQ ID NO:105,         positions 18 to 439 of SEQ ID NO:106, positions 27 to 460 of SEQ         ID NO:107, positions 23 to 446 of SEQ ID NO:108, positions 17 to         446 of SEQ ID NO:109, positions 21 to 447 of SEQ ID NO:110,         positions 19 to 447 of SEQ ID NO:111, positions 18 to 449 of SEQ         ID NO:112, positions 22 to 457 of SEQ ID NO:113, positions 18 to         445 of SEQ ID NO:114, positions 18 to 448 of SEQ ID NO:115,         positions 18 to 448 of SEQ ID NO:116, positions 23 to 435 of SEQ         ID NO:117, positions 21 to 442 of SEQ ID NO:118, positions 23 to         435 of SEQ ID NO:119, positions 20 to 445 of SEQ ID NO:120,         positions 21 to 443 of SEQ ID NO:121, positions 20 to 445 of SEQ         ID NO:122, positions 23 to 443 of SEQ ID NO:123, positions 20 to         445 of SEQ ID NO:124, positions 21 to 435 of SEQ ID NO:125,         positions 20 to 437 of SEQ ID NO:126, positions 21 to 442 of SEQ         ID NO:127, positions 23 to 434 of SEQ ID NO:128, positions 20 to         444 of SEQ ID NO:129, positions 21 to 435 of SEQ ID NO:130,         positions 20 to 445 of 
SEQ ID NO:131, positions 21 to 446 of SEQ         ID NO:132, positions 21 to 435 of SEQ ID NO:133, positions 22 to         448 of SEQ ID NO:134, positions 23 to 433 of SEQ ID NO:135,         positions 23 to 434 of SEQ ID NO:136, positions 23 to 435 of SEQ         ID NO:137, positions 23 to 435 of SEQ ID NO:138, positions 20 to         445 of SEQ ID NO:139, positions 20 to 437 of SEQ ID NO:140,         positions 21 to 435 of SEQ ID NO:141, positions 20 to 437 of SEQ         ID NO:142, positions 21 to 435 of SEQ ID NO:143, positions 26 to         435 of SEQ ID NO:144, positions 23 to 435 of SEQ ID NO:145,         positions 24 to 443 of SEQ ID NO:146, positions 20 to 445 of SEQ         ID NO:147, positions 21 to 441 of SEQ ID NO:148, and positions         20 to 437 of SEQ ID NO:149 (preferably the CD corresponding to         positions 26-455 of SEQ ID NO:1 or 18-444 of SEQ ID NO:2);         and/or     -   comprise an amino acid sequence having at least 50%, 51%, 52%,         53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%,         66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%,         79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%,         92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more, or complete         (100%) sequence identity to a mature polypeptide of a reference         CBH I exemplified in Table 1 (i.e., a mature protein comprising         an amino acid sequence corresponding to positions 26 to 529 of         SEQ ID NO:1, positions 18 to 514 of SEQ ID NO:2, positions 26 to         529 of SEQ ID NO:3, positions 1 to 427 of SEQ ID NO:4, positions         24 to 526 of SEQ ID NO:5, positions 18 to 512 of SEQ ID NO:6,         positions 27 to 532 of SEQ ID NO:7, positions 27 to 539 of SEQ         ID NO:8, positions 20 to 449 of SEQ ID NO:9, positions 1 to 424         of SEQ ID NO:10, positions 18 to 447 of SEQ ID NO:11, positions         18 to 434 of SEQ ID NO:12, positions 18 to 521 of SEQ ID NO:13,         
positions 19 to 454 of SEQ ID NO:14, positions 19 to 596 of SEQ         ID NO:15, positions 2 to 426 of SEQ ID NO:16, positions 23 to         446 of SEQ ID NO:17, positions 19 to 525 of SEQ ID NO:18,         positions 23 to 446 of SEQ ID NO:19, positions 19 to 530 of SEQ         ID NO:20, positions 2 to 416 of SEQ ID NO:21, positions 19 to         454 of SEQ ID NO:22, positions 19 to 506 of SEQ ID NO:23,         positions 19 to 447 of SEQ ID NO:24, positions 20 to 443 of SEQ         ID NO:25, positions 18 to 447 of SEQ ID NO:26, positions 19 to         516 of SEQ ID NO:27, positions 18 to 451 of SEQ ID NO:28,         positions 23 to 446 of SEQ ID NO:29, positions 18 to 514 of SEQ         ID NO:30, positions 18 to 451 of SEQ ID NO:31, positions 18 to         447 of SEQ ID NO:32, positions 19 to 449 of SEQ ID NO:33,         positions 18 to 447 of SEQ ID NO:34, positions 26 to 529 of SEQ         ID NO:35, positions 19 to 528 of SEQ ID NO:36, positions 19 to         453 of SEQ ID NO:37, positions 18 to 512 of SEQ ID NO:38,         positions 19 to 586 of SEQ ID NO:39, positions 19 to 510 of SEQ         ID NO:40, positions 18 to 513 of SEQ ID NO:41, positions 24 to         541 of SEQ ID NO:42, positions 18 to 516 of SEQ ID NO:43,         positions 19 to 453 of SEQ ID NO:44, positions 26 to 537 of SEQ         ID NO:45, positions 19 to 523 of SEQ ID NO:46, positions 18 to         443 of SEQ ID NO:47, positions 18 to 511 of SEQ ID NO:48,         positions 19 to 523 of SEQ ID NO:49, positions 18 to 513 of SEQ         ID NO:50, positions 2 to 419 of SEQ ID NO:51, positions 27 to         535 of SEQ ID NO:52, positions 21 to 445 of SEQ ID NO:53,         positions 19 to 449 of SEQ ID NO:54, positions 19 to 528 of SEQ         ID NO:55, positions 18 to 443 of SEQ ID NO:56, positions 20 to         443 of SEQ ID NO:57, positions 18 to 514 of SEQ ID NO:58,         positions 18 to 447 of SEQ ID NO:59, positions 26 to 529 of SEQ         ID NO:60, positions 19 to 525 of SEQ ID NO:61, 
positions 19 to         532 of SEQ ID NO:62, positions 26 to 460 of SEQ ID NO:63,         positions 18 to 510 of SEQ ID NO:64, positions 19 to 512 of SEQ         ID NO:65, positions 19 to 521 of SEQ ID NO:66, positions 1 to         505 of SEQ ID NO:67, positions 19 to 526 of SEQ ID NO:68,         positions 19 to 511 of SEQ ID NO:69, positions 23 to 447 of SEQ         ID NO:70, positions 17 to 448 of SEQ ID NO:71, positions 19 to         449 of SEQ ID NO:72, positions 18 to 514 of SEQ ID NO:73,         positions 23 to 540 of SEQ ID NO:74, positions 20 to 452 of SEQ         ID NO:75, positions 18 to 504 of SEQ ID NO:76, positions 18 to         446 of SEQ ID NO:77, positions 22 to 536 of SEQ ID NO:78,         positions 18 to 508 of SEQ ID NO:79, positions 1 to 431 of SEQ         ID NO:80, positions 19 to 453 of SEQ ID NO:81, positions 21 to         440 of SEQ ID NO:82, positions 19 to 516 of SEQ ID NO:83,         positions 18 to 448 of SEQ ID NO:84, positions 17 to 446 of SEQ         ID NO:85, positions 18 to 523 of SEQ ID NO:86, positions 18 to         443 of SEQ ID NO:87, positions 23 to 448 of SEQ ID NO:88,         positions 18 to 451 of SEQ ID NO:89, positions 21 to 447 of SEQ         ID NO:90, positions 18 to 444 of SEQ ID NO:91, positions 19 to         510 of SEQ ID NO:92, positions 20 to 504 of SEQ ID NO:93,         positions 18 to 450 of SEQ ID NO:94, positions 22 to 453 of SEQ         ID NO:95, positions 16 to 536 of SEQ ID NO:96, positions 21 to         445 of SEQ ID NO:97, positions 19 to 517 of SEQ ID NO:98,         positions 19 to 516 of SEQ ID NO:99, positions 19 to 523 of SEQ         ID NO:100, positions 18 to 507 of SEQ ID NO:101, positions 19 to         516 of SEQ ID NO:102, positions 20 to 457 of SEQ ID NO:103,         positions 19 to 454 of SEQ ID NO:104, positions 18 to 505 of SEQ         ID NO:105, positions 18 to 516 of SEQ ID NO:106, positions 27 to         530 of SEQ ID NO:107, positions 23 to 446 of SEQ ID NO:108,         positions 17 to 446 
of SEQ ID NO:109, positions 21 to 447 of SEQ         ID NO:110, positions 19 to 523 of SEQ ID NO:111, positions 18 to         513 of SEQ ID NO:112, positions 22 to 536 of SEQ ID NO:113,         positions 18 to 445 of SEQ ID NO:114, positions 18 to 526 of SEQ         ID NO:115, positions 18 to 538 of SEQ ID NO:116, positions 23 to         435 of SEQ ID NO:117, positions 21 to 442 of SEQ ID NO:118,         positions 23 to 435 of SEQ ID NO:119, positions 20 to 445 of SEQ         ID NO:120, positions 21 to 443 of SEQ ID NO:121, positions 20 to         445 of SEQ ID NO:122, positions 23 to 443 of SEQ ID NO:123,         positions 20 to 445 of SEQ ID NO:124, positions 21 to 435 of SEQ         ID NO:125, positions 20 to 437 of SEQ ID NO:126, positions 21 to         442 of SEQ ID NO:127, positions 23 to 434 of SEQ ID NO:128,         positions 20 to 444 of SEQ ID NO:129, positions 21 to 435 of SEQ         ID NO:130, positions 20 to 445 of SEQ ID NO:131, positions 21 to         446 of SEQ ID NO:132, positions 21 to 435 of SEQ ID NO:133,         positions 22 to 448 of SEQ ID NO:134, positions 23 to 433 of SEQ         ID NO:135, positions 23 to 434 of SEQ ID NO:136, positions 23 to         435 of SEQ ID NO:137, positions 23 to 435 of SEQ ID NO:138,         positions 20 to 445, of SEQ ID NO:139, positions 20 to 437 of         SEQ ID NO:140, positions 21 to 435 of SEQ ID NO:141, positions         20 to 437 of SEQ ID NO:142, positions 21 to 435 of SEQ ID         NO:143, positions 26 to 435 of SEQ ID NO:144, positions 23 to         435 of SEQ ID NO:145, positions 24 to 443 of SEQ ID NO:146,         positions 20 to 445 of SEQ ID NO:147, positions 21 to 441 of SEQ         ID NO:148, and positions 20 to 437 of SEQ ID NO:149, preferably         the mature polypeptide corresponding to positions 26-529 of SEQ         ID NO:1 or 18-514 of SEQ ID NO:2).
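The identity thresholds recited above can be illustrated with a short calculation. The following is a minimal sketch only: it assumes the two sequences have already been aligned (gaps written as "-") and it adopts one common convention, identical residues divided by the number of alignment columns. The function names `percent_identity` and `meets_threshold` are hypothetical, and other conventions for the denominator exist.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment.

    Assumption: both strings are rows of the same alignment, with '-' for gaps.
    Columns where both rows are gaps are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment has no usable columns")
    # A column counts as identical only when both residues match and neither is a gap.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_variant: str, aligned_reference: str,
                    threshold_percent: float) -> bool:
    """True when identity to the reference is at least the stated threshold."""
    return percent_identity(aligned_variant, aligned_reference) >= threshold_percent


if __name__ == "__main__":
    # Toy 10-residue fragments, not taken from any SEQ ID NO.
    print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))
    print(meets_threshold("ACDEFGHIKL", "ACDEFGHIKV", 90))
```

In practice the alignment itself would come from a tool such as BLAST, and the reference row would be the CD or mature polypeptide region of the chosen reference CBH I.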

An example of an algorithm that is suitable for determining sequence similarity is the BLAST algorithm, which is described in Altschul et al., 1990, J. Mol. Biol. 215:403-410. Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information. This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence that either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. These initial neighborhood word hits act as starting points to find longer HSPs containing them. The word hits are expanded in both directions along each of the two sequences being compared for as far as the cumulative alignment score can be increased. Extension of the word hits is stopped when: the cumulative alignment score falls off by the quantity X from a maximum achieved value; the cumulative score goes to zero or below; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLAST program uses as defaults a word length (W) of 11, the BLOSUM62 scoring matrix (see Henikoff & Henikoff, 1992, Proc. Nat'l. Acad. Sci. USA 89:10915-10919), alignments (B) of 50, expectation (E) of 10, M=5, N=-4, and a comparison of both strands.
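The extension step described above can be sketched in a few lines. This is an illustrative simplification, not the BLAST implementation: it performs ungapped extension of a single exact word hit, scores nucleotides with a simple match reward and mismatch penalty (5 and -4 are assumed here), and applies the three stopping rules named in the text. The function name `extend_word_hit` and its parameters are hypothetical.

```python
def extend_word_hit(query, subject, q_start, s_start, word_len, x_drop,
                    match=5, mismatch=-4):
    """Ungapped two-directional extension of a seed word hit.

    Returns (score, q_begin, q_end), with the query range half-open.
    Extension in each direction stops when the running score drops x_drop
    below the best score seen, falls to zero or below, or a sequence ends.
    """
    if q_start < 0 or s_start < 0:
        raise ValueError("seed start positions must be non-negative")
    if q_start + word_len > len(query) or s_start + word_len > len(subject):
        raise ValueError("seed word lies outside a sequence")

    def score(a, b):
        return match if a == b else mismatch

    # Cumulative score of the seed word itself.
    seed = sum(score(query[q_start + i], subject[s_start + i])
               for i in range(word_len))

    def extend(q_pos, s_pos, step, start_score):
        best = running = start_score
        best_len = length = 0
        # Rule 3: stop at the end of either sequence.
        while 0 <= q_pos < len(query) and 0 <= s_pos < len(subject):
            running += score(query[q_pos], subject[s_pos])
            length += 1
            if running > best:
                best, best_len = running, length
            # Rule 1: fell off by X from the maximum; rule 2: zero or below.
            if best - running >= x_drop or running <= 0:
                break
            q_pos += step
            s_pos += step
        return best, best_len

    after_right, right_len = extend(q_start + word_len, s_start + word_len, 1, seed)
    total, left_len = extend(q_start - 1, s_start - 1, -1, after_right)
    return total, q_start - left_len, q_start + word_len + right_len


if __name__ == "__main__":
    # Toy sequences; the seed "GTAC" starts at index 4 in both.
    print(extend_word_hit("TTACGTACGTCC", "GGACGTACGTAA", 4, 4, 4, x_drop=8))
    # (40, 2, 10)
```

Only positions up to the best-scoring point are kept in each direction, so trailing mismatches examined before the X-drop cutoff do not enter the reported segment.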

Most CBH I polypeptides are secreted and are therefore expressed with a signal sequence that is cleaved upon secretion of the polypeptide from the cell. Accordingly, in certain aspects, the variant CBH I polypeptides of the disclosure further include a signal sequence. Exemplary signal sequences comprise amino acid sequences corresponding to positions 1 to 25 of SEQ ID NO:1, positions 1 to 17 of SEQ ID NO:2, positions 1 to 25 of SEQ ID NO:3, positions 1 to 23 of SEQ ID NO:5, positions 1 to 17 of SEQ ID NO:6, positions 1 to 26 of SEQ ID NO:7, positions 1 to 27 of SEQ ID NO:8, positions 1 to 19 of SEQ ID NO:9, positions 1 to 17 of SEQ ID NO:11, positions 1 to 17 of SEQ ID NO:12, positions 1 to 17 of SEQ ID NO:13, positions 1 to 18 of SEQ ID NO:14, positions 1 to 18 of SEQ ID NO:15, positions 1 to 22 of SEQ ID NO:17, positions 1 to 18 of SEQ ID NO:18, positions 1 to 22 of SEQ ID NO:19, positions 1 to 18 of SEQ ID NO:20, positions 1 to 18 of SEQ ID NO:22, positions 1 to 18 of SEQ ID NO:23, positions 1 to 18 of SEQ ID NO:24, positions 1 to 19 of SEQ ID NO:25, positions 1 to 17 of SEQ ID NO:26, positions 1 to 18 of SEQ ID NO:27, positions 1 to 17 of SEQ ID NO:28, positions 1 to 22 of SEQ ID NO:29, positions 1 to 18 of SEQ ID NO:30, positions 1 to 17 of SEQ ID NO:31, positions 1 to 17 of SEQ ID NO:32, positions 1 to 18 of SEQ ID NO:33, positions 1 to 17 of SEQ ID NO:34, positions 1 to 25 of SEQ ID NO:35, positions 1 to 18 of SEQ ID NO:36, positions 1 to 18 of SEQ ID NO:37, positions 1 to 17 of SEQ ID NO:38, positions 1 to 18 of SEQ ID NO:39, positions 1 to 18 of SEQ ID NO:40, positions 1 to 17 of SEQ ID NO:41, positions 1 to 23 of SEQ ID NO:42, positions 1 to 17 of SEQ ID NO:43, positions 1 to 18 of SEQ ID NO:44, positions 1 to 25 of SEQ ID NO:45, positions 1 to 18 of SEQ ID NO:46, positions 1 to 17 of SEQ ID NO:47, positions 1 to 17 of SEQ ID NO:48, positions 1 to 18 of SEQ ID NO:49, positions 1 to 17 of SEQ ID NO:50, positions 1 to 26 of SEQ ID NO:52, positions 1 to 20 
of SEQ ID NO:53, positions 1 to 18 of SEQ ID NO:54, positions 1 to 18 of SEQ ID NO:55, positions 1 to 17 of SEQ ID NO:56, positions 1 to 19 of SEQ ID NO:57, positions 1 to 17 of SEQ ID NO:58, positions 1 to 17 of SEQ ID NO:59, positions 1 to 25 of SEQ ID NO:60, positions 1 to 18 of SEQ ID NO:61, positions 1 to 18 of SEQ ID NO:62, positions 1 to 25 of SEQ ID NO:63, positions 1 to 17 of SEQ ID NO:64, positions 1 to 18 of SEQ ID NO:65, positions 1 to 18 of SEQ ID NO:66, positions 1 to 18 of SEQ ID NO:68, positions 1 to 18 of SEQ ID NO:69, positions 1 to 23 of SEQ ID NO:70, positions 1 to 17 of SEQ ID NO:71, positions 1 to 18 of SEQ ID NO:72, positions 1 to 17 of SEQ ID NO:73, positions 1 to 22 of SEQ ID NO:74, positions 1 to 19 of SEQ ID NO:75, positions 1 to 17 of SEQ ID NO:76, positions 1 to 17 of SEQ ID NO:77, positions 1 to 21 of SEQ ID NO:78, positions 1 to 18 of SEQ ID NO:79, positions 1 to 18 of SEQ ID NO:81, positions 1 to 20 of SEQ ID NO:82, positions 1 to 18 of SEQ ID NO:83, positions 1 to 17 of SEQ ID NO:84, positions 1 to 16 of SEQ ID NO:85, positions 1 to 17 of SEQ ID NO:86, positions 1 to 17 of SEQ ID NO:87, positions 1 to 22 of SEQ ID NO:88, positions 1 to 17 of SEQ ID NO:89, positions 1 to 20 of SEQ ID NO:90, positions 1 to 17 of SEQ ID NO:91, positions 1 to 18 of SEQ ID NO:92, positions 1 to 19 of SEQ ID NO:93, positions 1 to 17 of SEQ ID NO:94, positions 1 to 21 of SEQ ID NO:95, positions 1 to 15 of SEQ ID NO:96, positions 1 to 20 of SEQ ID NO:97, positions 1 to 18 of SEQ ID NO:98, positions 1 to 18 of SEQ ID NO:99, positions 1 to 18 of SEQ ID NO:100, positions 1 to 17 of SEQ ID NO:101, positions 1 to 18 of SEQ ID NO:102, positions 1 to 19 of SEQ ID NO:103, positions 1 to 18 of SEQ ID NO:104, positions 1 to 17 of SEQ ID NO:105, positions 1 to 17 of SEQ ID NO:106, positions 1 to 26 of SEQ ID NO:107, positions 1 to 22 of SEQ ID NO:108, positions 1 to 16 of SEQ ID NO:109, positions 1 to 20 of SEQ ID NO:110, positions 1 to 18 of SEQ ID NO:111, positions 
1 to 17 of SEQ ID NO:112, positions 1 to 21 of SEQ ID NO:113, positions 1 to 17 of SEQ ID NO:114, positions 1 to 17 of SEQ ID NO:115, positions 1 to 18 of SEQ ID NO:116, positions 1 to 22 of SEQ ID NO:117, positions 1 to 20 of SEQ ID NO:118, positions 1 to 22 of SEQ ID NO:119, positions 1 to 19 of SEQ ID NO:120, positions 1 to 20 of SEQ ID NO:121, positions 1 to 19 of SEQ ID NO:122, positions 1 to 22 of SEQ ID NO:123, positions 1 to 19 of SEQ ID NO:124, positions 1 to 20 of SEQ ID NO:125, positions 1 to 19 of SEQ ID NO:126, positions 1 to 21 of SEQ ID NO:127, positions 1 to 22 of SEQ ID NO:128, positions 1 to 19 of SEQ ID NO:129, positions 1 to 20 of SEQ ID NO:130, positions 1 to 19 of SEQ ID NO:131, positions 1 to 20 of SEQ ID NO:132, positions 1 to 20 of SEQ ID NO:133, positions 1 to 21 of SEQ ID NO:134, positions 1 to 22 of SEQ ID NO:135, positions 1 to 22 of SEQ ID NO:136, positions 1 to 22 of SEQ ID NO:137, positions 1 to 22 of SEQ ID NO:138, positions 1 to 19 of SEQ ID NO:139, positions 1 to 19 of SEQ ID NO:140, positions 1 to 20 of SEQ ID NO:141, positions 1 to 19 of SEQ ID NO:142, positions 1 to 20 of SEQ ID NO:143, positions 1 to 25 of SEQ ID NO:144, positions 1 to 22 of SEQ ID NO:145, positions 1 to 23 of SEQ ID NO:146, positions 1 to 19 of SEQ ID NO:147, positions 1 to 20 of SEQ ID NO:148, and positions 1 to 19 of SEQ ID NO:149.
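The position ranges above are 1-based and inclusive, which differs from the 0-based, half-open slicing used by most programming languages. The sketch below shows the conversion using the ranges recited for SEQ ID NO:1 (signal sequence 1 to 25, CD 26 to 455, mature polypeptide 26 to 529). The `Region` type, the `extract` helper, and the placeholder sequence are hypothetical; the placeholder is not the actual sequence of SEQ ID NO:1.

```python
from typing import NamedTuple


class Region(NamedTuple):
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def extract(sequence: str, region: Region) -> str:
    """Return the residues at positions region.start..region.end (1-based, inclusive)."""
    if not (1 <= region.start <= region.end <= len(sequence)):
        raise ValueError("region lies outside the sequence")
    return sequence[region.start - 1:region.end]


# Ranges recited in the text for SEQ ID NO:1.
SEQ1_SIGNAL = Region(1, 25)
SEQ1_CD = Region(26, 455)
SEQ1_MATURE = Region(26, 529)

if __name__ == "__main__":
    # Placeholder of the right length only; NOT the real SEQ ID NO:1.
    placeholder = "M" + "X" * 528
    signal = extract(placeholder, SEQ1_SIGNAL)
    mature = extract(placeholder, SEQ1_MATURE)
    catalytic = extract(placeholder, SEQ1_CD)
    print(len(signal), len(catalytic), len(mature))
    # Cleaving the signal sequence leaves the mature polypeptide.
    print(signal + mature == placeholder)
```

Because the signal sequence ends at position 25 and the mature polypeptide begins at position 26, concatenating the two slices reproduces the full-length precursor.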

Recombinant Expression of Variant CBH I Polypeptides

Cell Culture Systems

The disclosure also provides recombinant cells engineered to express variant CBH I polypeptides. Suitably, the variant CBH I polypeptide is encoded by a nucleic acid operably linked to a promoter.

Where recombinant expression in a filamentous fungal host is desired, the promoter can be a filamentous fungal promoter. The nucleic acids can be, for example, under the control of heterologous promoters. The variant CBH I polypeptides can also be expressed under the control of constitutive or inducible promoters. Examples of promoters that can be used include, but are not limited to, a cellulase promoter, a xylanase promoter, and the 1818 promoter (previously identified as a highly expressed protein by EST mapping in Trichoderma). For example, the promoter can suitably be a cellobiohydrolase, endoglucanase, or β-glucosidase promoter. A particularly suitable promoter can be, for example, a T. reesei cellobiohydrolase, endoglucanase, or β-glucosidase promoter. Non-limiting examples of promoters include a cbh1, cbh2, egl1, egl2, egl3, egl4, egl5, pki1, gpd1, xyn1, or xyn2 promoter.

Suitable host cells include cells of any microorganism (e.g., cells of a bacterium, a protist, an alga, a fungus (e.g., a yeast or filamentous fungus), or other microbe), and are preferably cells of a bacterium, a yeast, or a filamentous fungus.

Suitable host cells of the bacterial genera include, but are not limited to, cells of Escherichia, Bacillus, Lactobacillus, Pseudomonas, and Streptomyces. Suitable cells of bacterial species include, but are not limited to, cells of Escherichia coli, Bacillus subtilis, Bacillus licheniformis, Lactobacillus brevis, Pseudomonas aeruginosa, and Streptomyces lividans.

Suitable host cells of the genera of yeast include, but are not limited to, cells of Saccharomyces, Schizosaccharomyces, Candida, Hansenula, Pichia, Kluyveromyces, and Phaffia. Suitable cells of yeast species include, but are not limited to, cells of Saccharomyces cerevisiae, Schizosaccharomyces pombe, Candida albicans, Hansenula polymorpha, Pichia pastoris, P. canadensis, Kluyveromyces marxianus, and Phaffia rhodozyma.

Suitable host cells of filamentous fungi include all filamentous forms of the subdivision Eumycotina. Suitable cells of filamentous fungal genera include, but are not limited to, cells of Acremonium, Aspergillus, Aureobasidium, Bjerkandera, Ceriporiopsis, Chrysosporium, Coprinus, Coriolus, Corynascus, Chaetomium, Cryptococcus, Filobasidium, Fusarium, Gibberella, Humicola, Hypocrea, Magnaporthe, Mucor, Myceliophthora, Neocallimastix, Neurospora, Paecilomyces, Penicillium, Phanerochaete, Phlebia, Piromyces, Pleurotus, Scytalidium, Schizophyllum, Sporotrichum, Talaromyces, Thermoascus, Thielavia, Tolypocladium, Trametes, and Trichoderma. More preferably, the recombinant cell is a Trichoderma sp. (e.g., Trichoderma reesei), Penicillium sp., Humicola sp. (e.g., Humicola insolens), Aspergillus sp. (e.g., Aspergillus niger), Chrysosporium sp., Fusarium sp., or Hypocrea sp. Suitable cells can also include cells of various anamorph and teleomorph forms of these filamentous fungal genera.

Suitable cells of filamentous fungal species include, but are not limited to, cells of Aspergillus awamori, Aspergillus fumigatus, Aspergillus foetidus, Aspergillus japonicus, Aspergillus nidulans, Aspergillus niger, Aspergillus oryzae, Chrysosporium lucknowense, Fusarium bactridioides, Fusarium cerealis, Fusarium crookwellense, Fusarium culmorum, Fusarium graminearum, Fusarium graminum, Fusarium heterosporum, Fusarium negundi, Fusarium oxysporum, Fusarium reticulatum, Fusarium roseum, Fusarium sambucinum, Fusarium sarcochroum, Fusarium sporotrichioides, Fusarium sulphureum, Fusarium torulosum, Fusarium trichothecioides, Fusarium venenatum, Bjerkandera adusta, Ceriporiopsis aneirina, Ceriporiopsis caregiea, Ceriporiopsis gilvescens, Ceriporiopsis pannocinta, Ceriporiopsis rivulosa, Ceriporiopsis subrufa, Ceriporiopsis subvermispora, Coprinus cinereus, Coriolus hirsutus, Humicola insolens, Humicola lanuginosa, Mucor miehei, Myceliophthora thermophila, Neurospora crassa, Neurospora intermedia, Penicillium purpurogenum, Penicillium canescens, Penicillium solitum, Penicillium funiculosum, Phanerochaete chrysosporium, Phlebia radiata, Pleurotus eryngii, Talaromyces flavus, Thielavia terrestris, Trametes villosa, Trametes versicolor, Trichoderma harzianum, Trichoderma koningii, Trichoderma longibrachiatum, Trichoderma reesei, and Trichoderma viride.

The engineered host cells can be cultured in conventional nutrient media modified as appropriate for activating promoters, selecting transformants, or amplifying the nucleic acid sequence encoding the variant CBH I polypeptide. Culture conditions, such as temperature, pH and the like, are those previously used with the host cell selected for expression, and will be apparent to those skilled in the art. As noted, many references are available for the culture and production of many cells, including cells of bacterial and fungal origin. Cell culture media in general are set forth in Atlas and Parks (eds.), 1993, The Handbook of Microbiological Media, CRC Press, Boca Raton, Fla., which is incorporated herein by reference. For recombinant expression in filamentous fungal cells, the cells are cultured in a standard medium containing physiological salts and nutrients, such as described in Pourquie et al., 1988, Biochemistry and Genetics of Cellulose Degradation, eds. Aubert, et al., Academic Press, pp. 71-86; and Ilmen et al., 1997, Appl. Environ. Microbiol. 63:1298-1306. Culture conditions are also standard, e.g., cultures are incubated at 28° C. in shaker cultures or fermenters until desired levels of variant CBH I expression are achieved. Preferred culture conditions for a given filamentous fungus may be found in the scientific literature and/or from the source of the fungi such as the American Type Culture Collection (ATCC). After fungal growth has been established, the cells are exposed to conditions effective to cause or permit the expression of a variant CBH I.

In cases where a variant CBH I coding sequence is under the control of an inducible promoter, the inducing agent, e.g., a sugar, metal salt or antibiotic, is added to the medium at a concentration effective to induce variant CBH I expression.

In one embodiment, the recombinant cell is an Aspergillus niger, which is a useful strain for obtaining overexpressed polypeptide. For example, A. niger var. awamori dgr246 is known to produce elevated amounts of secreted cellulases (Goedegebuur et al., 2002, Curr. Genet. 41:89-98). Other strains of Aspergillus niger var. awamori such as GCDAP3, GCDAP4 and GAP3-4 are known (Ward et al., 1993, Appl. Microbiol. Biotechnol. 39:738-743).

In another embodiment, the recombinant cell is a Trichoderma reesei, which is a useful strain for obtaining overexpressed polypeptide. For example, RL-P37, described by Sheir-Neiss et al., 1984, Appl. Microbiol. Biotechnol. 20:46-53, is known to secrete elevated amounts of cellulase enzymes. Functional equivalents of RL-P37 include Trichoderma reesei strain RUT-C30 (ATCC No. 56765) and strain QM9414 (ATCC No. 26921). It is contemplated that these strains would also be useful in overexpressing variant CBH I polypeptides.

Cells expressing the variant CBH I polypeptides of the disclosure can be grown under batch, fed-batch or continuous fermentation conditions. Classical batch fermentation is a closed system, wherein the composition of the medium is set at the beginning of the fermentation and is not subject to artificial alterations during the fermentation. A variation of the batch system is a fed-batch fermentation in which the substrate is added in increments as the fermentation progresses. Fed-batch systems are useful when catabolite repression is likely to inhibit the metabolism of the cells and where it is desirable to have limited amounts of substrate in the medium. Batch and fed-batch fermentations are common and well known in the art. Continuous fermentation is an open system where a defined fermentation medium is added continuously to a bioreactor and an equal amount of conditioned medium is removed simultaneously for processing. Continuous fermentation generally maintains the cultures at a constant high density where cells are primarily in log phase growth. Continuous fermentation systems strive to maintain steady state growth conditions. Methods for modulating nutrients and growth factors for continuous fermentation processes as well as techniques for maximizing the rate of product formation are well known in the art of industrial microbiology.
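The difference between the three fermentation modes can be seen in a simple volume balance. The sketch below is bookkeeping only and makes no attempt to model cell growth or enzyme production. The function name `simulate_volume`, the step-wise feed, and the numeric values are illustrative assumptions.

```python
def simulate_volume(mode: str, steps: int, initial_volume: float,
                    feed_per_step: float) -> dict:
    """Track liquid volume, medium added, and medium removed for one mode.

    batch:      closed system, nothing added or removed after the start.
    fed-batch:  substrate is added in increments, nothing is removed.
    continuous: medium is added and an equal amount is removed each step.
    """
    volume = initial_volume
    added = 0.0
    removed = 0.0
    for _ in range(steps):
        if mode == "batch":
            pass  # composition and volume are fixed at the start
        elif mode == "fed-batch":
            volume += feed_per_step
            added += feed_per_step
        elif mode == "continuous":
            added += feed_per_step
            removed += feed_per_step  # inflow equals outflow, volume unchanged
        else:
            raise ValueError(f"unknown mode: {mode}")
    return {"volume": volume, "added": added, "removed": removed}


if __name__ == "__main__":
    for mode in ("batch", "fed-batch", "continuous"):
        print(mode, simulate_volume(mode, steps=4, initial_volume=10.0,
                                    feed_per_step=0.5))
```

Only the fed-batch case accumulates volume, while the continuous case processes the same amount of fresh medium at constant volume, which is what allows it to approach steady state conditions.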

Recombinant Expression in Plants

The disclosure provides transgenic plants and seeds that recombinantly express a variant CBH I polypeptide. The disclosure also provides plant products, e.g., oils, seeds, leaves, extracts and the like, comprising a variant CBH I polypeptide.

The transgenic plant can be dicotyledonous (a dicot) or monocotyledonous (a monocot). The disclosure also provides methods of making and using these transgenic plants and seeds. The transgenic plant or plant cell expressing a variant CBH I can be constructed in accordance with any method known in the art. See, for example, U.S. Pat. No. 6,309,872. T. reesei CBH I has been successfully expressed in transgenic tobacco (Nicotiana tabacum) and potato (Solanum tuberosum). See Hooker et al., 2000, in Glycosyl Hydrolases for Biomass Conversion, ACS Symposium Series, Vol. 769, Chapter 4, pp. 55-90.

In a particular aspect, the present disclosure provides for the expression of CBH I variants in transgenic plants or plant organs and methods for the production thereof. DNA expression constructs are provided for the transformation of plants with a nucleic acid encoding the variant CBH I polypeptide, preferably under the control of regulatory sequences which are capable of directing expression of the variant CBH I polypeptide. These regulatory sequences include sequences capable of directing transcription in plants, either constitutively, or in stage and/or tissue specific manners.

The expression of variant CBH I polypeptides in plants can be achieved by a variety of means. Specifically, for example, technologies are available for transforming a large number of plant species, including dicotyledonous species (e.g., tobacco, potato, tomato, Petunia, Brassica) and monocot species. Additionally, for example, strategies for the expression of foreign genes in plants are available. Additionally still, regulatory sequences from plant genes have been identified that are serviceable for the construction of chimeric genes that can be functionally expressed in plants and in plant cells (e.g., Klee, 1987, Ann. Rev. of Plant Phys. 38:467-486; Clark et al., 1990, Virology 179(2):640-7; Smith et al., 1990, Mol. Gen. Genet. 224(3):477-81).

The introduction of nucleic acids into plants can be achieved using several technologies including transformation with Agrobacterium tumefaciens or Agrobacterium rhizogenes. Non-limiting examples of plant tissues that can be transformed include protoplasts, microspores or pollen, and explants such as leaves, stems, roots, hypocotyls, and cotyls. Furthermore, DNA encoding a variant CBH I can be introduced directly into protoplasts and plant cells or tissues by microinjection, electroporation, particle bombardment, and direct DNA uptake.

Variant CBH I polypeptides can be produced in plants by a variety of expression systems. For instance, the use of a constitutive promoter such as the 35S promoter of Cauliflower Mosaic Virus (Guilley et al., 1982, Cell 30:763-73) is serviceable for the accumulation of the expressed protein in virtually all organs of the transgenic plant. Alternatively, promoters that are tissue-specific and/or stage-specific can be used (Higgins, 1984, Annu. Rev. Plant Physiol. 35:191-221; Shotwell and Larkins, 1989, In: The Biochemistry of Plants Vol. 15 (Academic Press, San Diego: Stumpf and Conn, eds.), p. 297) to permit expression of variant CBH I polypeptides in a target tissue and/or during a desired stage of development.

Compositions of Variant CBH I Polypeptides

In general, a variant CBH I polypeptide produced in cell culture is secreted into the medium and may be purified or isolated, e.g., by removing unwanted components from the cell culture medium. However, in some cases, a variant CBH I polypeptide may be produced in a cellular form necessitating recovery from a cell lysate. In such cases the variant CBH I polypeptide is purified from the cells in which it was produced using techniques routinely employed by those of skill in the art. Examples include, but are not limited to, affinity chromatography (Van Tilbeurgh et al., 1984, FEBS Lett. 169(2):215-218), ion-exchange chromatographic methods (Goyal et al., 1991, Bioresource Technology, 36:37-50; Fliess et al., 1983, Eur. J. Appl. Microbiol. Biotechnol. 17:314-318; Bhikhabhai et al., 1984, J. Appl. Biochem. 6:336-345; Ellouz et al., 1987, Journal of Chromatography, 396:307-317), including ion-exchange using materials with high resolution power (Medve et al., 1998, J. Chromatography A, 808:153-165), hydrophobic interaction chromatography (Tomaz and Queiroz, 1999, J. Chromatography A, 865:123-128), and two-phase partitioning (Brumbauer et al., 1999, Bioseparation 7:287-295).

The variant CBH I polypeptides of the disclosure are suitably used in cellulase compositions. Cellulases are known in the art as enzymes that hydrolyze cellulose (beta-1,4-glucan or beta D-glucosidic linkages) resulting in the formation of glucose, cellobiose, cellooligosaccharides, and the like. Cellulase enzymes have been traditionally divided into three major classes: endoglucanases (“EG”), exoglucanases or cellobiohydrolases (EC 3.2.1.91) (“CBH”) and beta-glucosidases (EC 3.2.1.21) (“BG”) (Knowles et al., 1987, TIBTECH 5:255-261; Schulein, 1988, Methods in Enzymology 160(25):234-243).

Certain fungi produce complete cellulase systems which include exo-cellobiohydrolases or CBH-type cellulases, endoglucanases or EG-type cellulases and β-glucosidases or BG-type cellulases (Schulein, 1988, Methods in Enzymology 160(25):234-243). Such cellulase compositions are referred to herein as “whole” cellulases. However, sometimes these systems lack CBH-type cellulases and bacterial cellulases also typically include little or no CBH-type cellulases. In addition, it has been shown that the EG components and CBH components synergistically interact to more efficiently degrade cellulose. See, e.g., Wood, 1985, Biochemical Society Transactions 13(2):407-410.

The cellulase compositions of the disclosure typically include, in addition to a variant CBH I polypeptide, one or more cellobiohydrolases, endoglucanases and/or β-glucosidases. In their crudest form, cellulase compositions contain the microorganism culture that produced the enzyme components. “Cellulase compositions” also refers to a crude fermentation product of the microorganisms. A crude fermentation product is preferably a fermentation broth that has been separated from the microorganism cells and/or cellular debris (e.g., by centrifugation and/or filtration). In some cases, the enzymes in the broth can be optionally diluted, concentrated, partially purified or purified and/or dried. The variant CBH I polypeptide can be co-expressed with one or more of the other components of the cellulase composition or it can be expressed separately, optionally purified and combined with a composition comprising one or more of the other cellulase components.

When employed in cellulase compositions, the variant CBH I is generally present in an amount sufficient to allow release of soluble sugars from the biomass. The amount of variant CBH I enzyme added depends upon the type of biomass to be saccharified and can be readily determined by the skilled artisan. In certain embodiments, the weight percent of variant CBH I polypeptide is suitably at least 1, at least 5, at least 10, or at least 20 weight percent of the total polypeptides in a cellulase composition. Exemplary cellulase compositions include a variant CBH I of the disclosure in an amount ranging from about 1 to about 20 weight percent, from about 1 to about 25 weight percent, from about 5 to about 20 weight percent, from about 5 to about 25 weight percent, from about 5 to about 30 weight percent, from about 5 to about 35 weight percent, from about 5 to about 40 weight percent, from about 5 to about 45 weight percent, from about 5 to about 50 weight percent, from about 10 to about 20 weight percent, from about 10 to about 25 weight percent, from about 10 to about 30 weight percent, from about 10 to about 35 weight percent, from about 10 to about 40 weight percent, from about 10 to about 45 weight percent, from about 10 to about 50 weight percent, from about 15 to about 20 weight percent, from about 15 to about 25 weight percent, from about 15 to about 30 weight percent, from about 15 to about 35 weight percent, from about 15 to about 40 weight percent, from about 15 to about 45 weight percent, or from about 15 to about 50 weight percent of the total polypeptides in the composition.

Utility of Variant CBH I Polypeptides

It can be appreciated that the variant CBH I polypeptides of the disclosure and compositions comprising the variant CBH I polypeptides find utility in a wide variety of applications, for example in detergent compositions that exhibit enhanced cleaning ability, function as a softening agent and/or improve the feel of cotton fabrics (e.g., “stone washing” or “biopolishing”), or in cellulase compositions for degrading wood pulp into sugars (e.g., for bio-ethanol production). Other applications include the treatment of mechanical pulp (Pere et al., 1996, Tappi Pulping Conference, pp. 693-696 (Nashville, Tenn., Oct. 27-31, 1996)), use as a feed additive (see, e.g., WO 91/04673) and use in grain wet milling.

Saccharification Reactions

Ethanol can be produced via saccharification and fermentation processes from cellulosic biomass such as trees, herbaceous plants, municipal solid waste and agricultural and forestry residues. However, the ratio of individual cellulase enzymes within a naturally occurring cellulase mixture produced by a microbe may not be the most efficient for rapid conversion of cellulose in biomass to glucose. It is known that endoglucanases act to produce new cellulose chain ends which themselves are substrates for the action of cellobiohydrolases and thereby improve the efficiency of hydrolysis of the entire cellulase system. The use of optimized cellobiohydrolase activity may greatly enhance the production of ethanol.

Cellulase compositions comprising one or more of the variant CBH I polypeptides of the disclosure can be used in saccharification reactions to produce simple sugars for fermentation. Accordingly, the present disclosure provides methods for saccharification comprising contacting biomass with a cellulase composition comprising a variant CBH I polypeptide of the disclosure and, optionally, subjecting the resulting sugars to fermentation by a microorganism.

The term “biomass,” as used herein, refers to any composition comprising cellulose (optionally also hemicellulose and/or lignin). As used herein, biomass includes, without limitation, seeds, grains, tubers, plant waste or byproducts of food processing or industrial processing (e.g., stalks), corn (including, e.g., cobs, stover, and the like), grasses (including, e.g., Indian grass, such as Sorghastrum nutans; or, switchgrass, e.g., Panicum species, such as Panicum virgatum), wood (including, e.g., wood chips, processing waste), paper, pulp, and recycled paper (including, e.g., newspaper, printer paper, and the like). Other biomass materials include, without limitation, potatoes, soybean (e.g., rapeseed), barley, rye, oats, wheat, beets, and sugar cane bagasse.

The saccharified biomass (e.g., lignocellulosic material processed by enzymes of the disclosure) can be made into a number of bio-based products, via processes such as, e.g., microbial fermentation and/or chemical synthesis. As used herein, “microbial fermentation” refers to a process of growing and harvesting fermenting microorganisms under suitable conditions. The fermenting microorganism can be any microorganism suitable for use in a desired fermentation process for the production of bio-based products. Suitable fermenting microorganisms include, without limitation, filamentous fungi, yeast, and bacteria. The saccharified biomass can, for example, be made into a fuel (e.g., a biofuel such as a bioethanol, biobutanol, biomethanol, a biopropanol, a biodiesel, a jet fuel, or the like) via fermentation and/or chemical synthesis. The saccharified biomass can, for example, also be made into a commodity chemical (e.g., ascorbic acid, isoprene, 1,3-propanediol), lipids, amino acids, polypeptides, and enzymes, via fermentation and/or chemical synthesis.

Thus, in certain aspects, the variant CBH I polypeptides of the disclosure find utility in the generation of ethanol from biomass in either separate or simultaneous saccharification and fermentation processes. Separate saccharification and fermentation is a process whereby cellulose present in biomass is saccharified into simple sugars (e.g., glucose) and the simple sugars subsequently fermented by microorganisms (e.g., yeast) into ethanol. Simultaneous saccharification and fermentation is a process whereby cellulose present in biomass is saccharified into simple sugars (e.g., glucose) and, at the same time and in the same reactor, microorganisms (e.g., yeast) ferment the simple sugars into ethanol.

Prior to saccharification, biomass is preferably subject to one or more pretreatment step(s) in order to render cellulose material more accessible or susceptible to enzymes and thus more amenable to hydrolysis by the variant CBH I polypeptides of the disclosure.

In an exemplary embodiment, the pretreatment entails subjecting biomass material to a catalyst comprising a dilute solution of a strong acid and a metal salt in a reactor. The biomass material can, e.g., be a raw material or a dried material. This pretreatment can lower the activation energy, or the temperature, of cellulose hydrolysis, ultimately allowing higher yields of fermentable sugars. See, e.g., U.S. Pat. Nos. 6,660,506; 6,423,145.

Another exemplary pretreatment method entails hydrolyzing biomass by subjecting the biomass material to a first hydrolysis step in an aqueous medium at a temperature and a pressure chosen to effectuate primarily depolymerization of hemicellulose without achieving significant depolymerization of cellulose into glucose. This step yields a slurry in which the liquid aqueous phase contains dissolved monosaccharides resulting from depolymerization of hemicellulose, and a solid phase containing cellulose and lignin. The slurry is then subject to a second hydrolysis step under conditions that allow a major portion of the cellulose to be depolymerized, yielding a liquid aqueous phase containing dissolved/soluble depolymerization products of cellulose. See, e.g., U.S. Pat. No. 5,536,325.

A further exemplary method involves processing a biomass material by one or more stages of dilute acid hydrolysis using about 0.4% to about 2% of a strong acid, followed by treating the unreacted solid lignocellulosic component of the acid hydrolyzed material with alkaline delignification. See, e.g., U.S. Pat. No. 6,409,841. Another exemplary pretreatment method comprises prehydrolyzing biomass (e.g., lignocellulosic materials) in a prehydrolysis reactor; adding an acidic liquid to the solid lignocellulosic material to make a mixture; heating the mixture to reaction temperature; maintaining reaction temperature for a period of time sufficient to fractionate the lignocellulosic material into a solubilized portion containing at least about 20% of the lignin from the lignocellulosic material, and a solid fraction containing cellulose; separating the solubilized portion from the solid fraction, and removing the solubilized portion while at or near reaction temperature; and recovering the solubilized portion. The cellulose in the solid fraction is rendered more amenable to enzymatic digestion. See, e.g., U.S. Pat. No. 5,705,369. Further pretreatment methods can involve the use of hydrogen peroxide (H₂O₂). See Gould, 1984, Biotech. and Bioengr. 26:46-52.

Pretreatment can also comprise contacting a biomass material with stoichiometric amounts of sodium hydroxide and ammonium hydroxide at a very low concentration. See Teixeira et al., 1999, Appl. Biochem. and Biotech. 77-79:19-34. Pretreatment can also comprise contacting a lignocellulose with a chemical (e.g., a base, such as sodium carbonate or potassium hydroxide) at a pH of about 9 to about 14 at moderate temperature, pressure, and pH. See PCT Publication WO2004/081185.

Ammonia pretreatment can also be used. Such a pretreatment method comprises subjecting a biomass material to low ammonia concentration under conditions of high solids. See, e.g., U.S. Patent Publication No. 20070031918 and PCT publication WO 06/110901.

Detergent Compositions Comprising Variant CBH I Proteins

The present disclosure also provides detergent compositions comprising a variant CBH I polypeptide of the disclosure. In addition to the variant CBH I polypeptide, the detergent compositions may employ one or more of a surfactant, including anionic, non-ionic and ampholytic surfactants; a hydrolase; a bleaching agent; a bluing agent; a caking inhibitor; a solubilizer; and a cationic surfactant. All of these components are known in the detergent art.

The variant CBH I polypeptide is preferably provided as part of a cellulase composition. The cellulase composition can be employed from about 0.00005 weight percent to about 5 weight percent or from about 0.0002 weight percent to about 2 weight percent of the total detergent composition. The cellulase composition can be in the form of a liquid diluent, granule, emulsion, gel, paste, and the like. Such forms are known to the skilled artisan. When a solid detergent composition is employed, the cellulase composition is preferably formulated as granules.

Examples

Materials and Methods

Preparation of CBH I Polypeptides for Biochemical Characterization

Protein expression was carried out in an Aspergillus niger host strain that had been transformed using PEG-mediated transformation with expression constructs for CBH I that included the hygromycin resistance gene as a selectable marker, in which the full length CBH I sequences (signal sequence, catalytic domain, linker and cellulose binding domain) were under the control of the glyceraldehyde-3-phosphate dehydrogenase (gpd) promoter. Transformants were selected on the regeneration medium based on resistance to hygromycin. The selected transformants were cultured in Aspergillus salts medium (pH 6.2, supplemented with the antibiotics penicillin, streptomycin, and hygromycin, and 80 g/L glycerol, 20 g/L soytone, 10 mM uridine, 20 g/L MES) in baffled shake flasks at 30° C., 170 rpm. After five days of incubation, the total secreted protein supernatant was recovered, and then subjected to hollow fiber filtration to concentrate and exchange the sample into acetate buffer (50 mM NaAc, pH 5). CBH I protein represented over 90% of the total protein in these samples. Protein purity was analyzed by SDS-PAGE. Protein concentration was determined by gel densitometry and/or HPLC analysis. All CBH I protein concentrations were normalized before assay and concentrated to 1-2.5 mg/ml.

CBH I Activity Assays

4-Methylumbelliferyl Lactoside (4-MUL) Assay:

This assay measures the activity of CBH I on the fluorogenic substrate 4-MUL (also known as MUL). Assays were run in a Costar 96-well black bottom plate, where reactions were initiated by the addition of 4-MUL to enzyme in buffer (2 mM 4-MUL in 200 mM MES pH 6). Enzymatic rates were monitored by fluorescent readouts over five minutes on a SPECTRAMAX™ plate reader (ex/em 365/450 nm). Data in the linear range were used to calculate initial rates (Vo).
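One common way to obtain an initial rate from the linear portion of a fluorescence trace is an ordinary least-squares fit of signal against time. The sketch below is illustrative only: the function name `initial_rate`, the optional fitting window, and the example readings are assumptions and are not taken from the assay as described.

```python
def initial_rate(times_s, signals, window=None):
    """Return the least-squares slope of signal vs. time (signal units per second).

    `window` is an optional (start, stop) index pair selecting the points
    judged to lie in the linear range; choosing it is left to the analyst.
    """
    if window is not None:
        start, stop = window
        times_s = times_s[start:stop]
        signals = signals[start:stop]
    n = len(times_s)
    if n < 2:
        raise ValueError("need at least two points to fit a slope")
    mean_t = sum(times_s) / n
    mean_s = sum(signals) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (s - mean_s) for t, s in zip(times_s, signals))
    return sxy / sxx


# Hypothetical readings over five minutes (arbitrary fluorescence units).
times = [0, 60, 120, 180, 240, 300]
rfu = [100.0, 220.0, 340.0, 460.0, 580.0, 700.0]
v0 = initial_rate(times, rfu)
```

Restricting the fit with `window` keeps later, curved portions of a trace from lowering the estimated slope.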

Phosphoric Acid Swollen Cellulose (PASC) Assay:

This assay measures the activity of CBH I using PASC as the substrate. During the assay, the concentration of PASC is monitored by a fluorescent signal derived from calcofluor binding to PASC (ex/em 365/440 nm). The assay is initiated by mixing enzyme (15 μl) and reaction buffer (85 μl of 0.2% PASC, 200 mM MES, pH 6), and then incubating at 35° C. while shaking at 225 RPM. After 2 hours, one reaction volume of calcofluor stop solution (100 μg/ml in 500 mM glycine pH 10) is added and fluorescence read-outs obtained (ex/em 365/440 nm).

Bagasse Assay:

This assay measures the activity of CBH I on bagasse, a lignocellulosic substrate. Reactions were run in 10 ml vials with 5% dilute acid pretreated bagasse (250 mg solids per 5 ml reaction). Each reaction contained 4 mg CBH I enzyme/g solids, 200 mM MES pH 6, kanamycin, and chloramphenicol. Reactions were incubated at 35° C. in hybridization incubators (Robbins Scientific), rotating at 20 RPM. Time points were taken by transferring a sample of homogenous slurry (150 μl) into a 96-well deep well plate and quenching the reaction with stop buffer (450 μl of 500 mM sodium carbonate, pH 10). Time point measurements were taken every 24 hours for 72 hours.
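The loadings stated above can be checked with simple arithmetic. The following sketch uses hypothetical helper names; the default values are the figures given in the text, and the assumption that sampling begins at 24 hours (with no separate zero-time sample) is ours.

```python
def bagasse_reaction_setup(solids_mg=250.0, volume_ml=5.0, enzyme_mg_per_g=4.0):
    """Derive the solids loading and enzyme mass for one reaction vial."""
    solids_g = solids_mg / 1000.0
    return {
        # grams of solids per 100 ml of reaction, i.e. percent (w/v)
        "solids_percent_w_v": 100.0 * solids_g / volume_ml,
        # enzyme dose is specified per gram of solids
        "enzyme_mg": enzyme_mg_per_g * solids_g,
    }


def quench_dilution_factor(sample_ul=150.0, stop_ul=450.0):
    """Fold dilution of the slurry sample after stop buffer is added."""
    return (sample_ul + stop_ul) / sample_ul


def sampling_times_h(interval_h=24, total_h=72):
    """Time points in hours, assuming the first sample is taken after one interval."""
    return list(range(interval_h, total_h + 1, interval_h))


setup = bagasse_reaction_setup()
dilution = quench_dilution_factor()
time_points = sampling_times_h()
```

With the stated figures, each vial holds 5% (w/v) solids and 1 mg of CBH I, and each quenched sample is diluted four-fold, which would need to be accounted for when converting measured sugar concentrations back to the reaction.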

Cellobiose Tolerance Assays (or Cellobiose Inhibition Assays):

Tolerance to cellobiose (or inhibition caused by cellobiose) was tested in two ways in the CBH I assays. A direct-dose tolerance method can be applied to all of the CBH I assays (i.e., 4-MUL, PASC, and/or bagasse assays), and entails the exogenous addition of a known amount of cellobiose into assay mixtures. A different indirect method entails the addition of an excess amount of β-glucosidase (BG) to PASC and bagasse assays (typically, 1 mg β-glucosidase/g solids loaded). BG will enzymatically hydrolyze the cellobiose generated during these assays; therefore, CBH I activity in the presence of BG can be taken as a measure of activity in the absence of cellobiose. Furthermore, when activity in the presence and absence of BG are similar, this indicates tolerance to cellobiose. Notably, in cases where BG activity is undesired, but may be present in crude CBH I enzyme preparations, the BG inhibitor gluconolactone can be added into CBH I assays to prevent cellobiose breakdown.

Library Screening Assays

The wild type CBH I polypeptide BD29555 was mutagenized to identify variants with improved product tolerance. A small (60-member) library of BD29555 variants was designed to identify variant CBH I polypeptides with reduced product inhibition. This product-release-site library was designed based on residues directly interacting with the cellobiose product in an attempt to identify variants with weakened interactions with cellobiose from which the product would be released more readily than from the wild type enzyme. The 60-member evolution library contained wild-type residues and mutations at positions R273, W405, and R422 of BD29555 (SEQ ID NO:1), and included the following substitutions: R273 (WT), R273Q, R273K, R273A, W405 (WT), W405Q, W405H, R422 (WT), R422Q, R422K, R422L, and R422E (4 variants at position 273 × 3 variants at position 405 × 5 variants at position 422 = 60 variants in total). All members of the library were screened using the 4-MUL assay in the presence and absence of 250 mg/L cellobiose and using gluconolactone to inhibit any BG activity. The R273A, R273Q, and R273K/R422K variants showed enhanced product tolerance. The R273K/R422K variant showed the greatest activity among the variants and cellobiose tolerance at 250 mg/L. Due to low expression, the R273K variant was not tested for product inhibition.
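The combinatorial size of the library can be reproduced by enumerating every combination of the listed residues. This is a minimal sketch: the dictionary and function names are hypothetical, and arginine (R) is taken as the wild-type residue at position 273, as the substitution names R273Q, R273K, and R273A indicate.

```python
from itertools import product

# Residues allowed at each library position (wild type listed first).
POSITION_OPTIONS = {
    273: ["R", "Q", "K", "A"],
    405: ["W", "Q", "H"],
    422: ["R", "Q", "K", "L", "E"],
}
WILD_TYPE = {273: "R", 405: "W", 422: "R"}


def variant_name(choice):
    """Format a {position: residue} choice as e.g. 'R273K/R422K', or 'WT'."""
    subs = [
        f"{WILD_TYPE[pos]}{pos}{aa}"
        for pos, aa in sorted(choice.items())
        if aa != WILD_TYPE[pos]
    ]
    return "/".join(subs) if subs else "WT"


def enumerate_library():
    """Return the names of all residue combinations across the three positions."""
    positions = sorted(POSITION_OPTIONS)
    names = []
    for combo in product(*(POSITION_OPTIONS[p] for p in positions)):
        names.append(variant_name(dict(zip(positions, combo))))
    return names


library = enumerate_library()
```

The resulting list has 4 × 3 × 5 = 60 entries, including the unmodified wild type and the R273K/R422K double variant.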

Characterization of Product Tolerant Variants of BD29555

The R273K/R422K substitutions were characterized in both a wild type BD29555 background and also in combination with the substitutions Y274Q, D281K, Y410H, P411G, which were identified in a screen of an expanded product release site evolution library.

The wild type, the R273K/R422K variant and the R273K/Y274Q/D281K/Y410H/P411G/R422K variant were tested for activity on 4-MUL in the presence and absence of 250 mg/L cellobiose, and the R273K/R422K variant was also tested in the bagasse assay in the presence and absence of BG. The results are summarized in Table 5.

The results from these activity assays were converted into the percentage of activity remaining with and without cellobiose present, where values close to 100% indicated cellobiose tolerance. The percent of activity remaining in the MUL assay in the presence of cellobiose versus in the absence of cellobiose shows that the R273K/R422K variant was the most tolerant, followed by the R273K/Y274Q/D281K/Y410H/P411G/R422K variant, and then wild-type, at 95%, 78%, and 25% activity, respectively.
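The conversion described above amounts to a ratio of rates expressed as a percentage. The sketch below shows one way to compute and rank it; the function name and the example rates (0.5 and 2.0, arbitrary units) are hypothetical, while the three percentages used for ranking are those reported in the text.

```python
def percent_activity_remaining(rate_with_cellobiose, rate_without_cellobiose):
    """Activity with cellobiose as a percentage of activity without it."""
    if rate_without_cellobiose <= 0:
        raise ValueError("uninhibited rate must be positive")
    return 100.0 * rate_with_cellobiose / rate_without_cellobiose


# Hypothetical rates in arbitrary units, for illustration only.
example_percent = percent_activity_remaining(0.5, 2.0)

# Percentages reported in the text for the 4-MUL assay.
reported = {
    "R273K/R422K": 95,
    "R273K/Y274Q/D281K/Y410H/P411G/R422K": 78,
    "wild type": 25,
}

# Rank from most to least cellobiose tolerant.
ranking = sorted(reported, key=reported.get, reverse=True)
```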

Cellobiose dose response curves of the wild-type and R273K/R422K variant of BD29555 were obtained during the 4-MUL assay. Enzyme rates (Vo) were measured in the presence of different concentrations of cellobiose (200 mM MES pH 6, 25° C.). Rates were measured in quadruplicate. The results are shown in FIG. 1A-1B. FIG. 1A shows that wild type BD29555 is inhibited by cellobiose, with a half maximal inhibitory concentration (IC₅₀ value) of 60 mg/L. FIG. 1B shows that the R273K/R422K variant is tolerant to cellobiose up to 250 mg/L.
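An IC₅₀ value can be read from a dose response curve as the concentration at which the rate falls to half of the uninhibited rate. The text does not say how the reported value was derived, so the following is only a model-free sketch using linear interpolation between neighboring doses. The function name is hypothetical, and the example rates are synthetic, generated from an assumed simple hyperbolic inhibition curve rather than taken from FIG. 1A-1B.

```python
def estimate_ic50(concentrations, rates):
    """Interpolate the concentration at which the rate is half the uninhibited rate.

    `concentrations` must be ascending and start at 0 (the uninhibited control).
    Returns None if the rate never drops to half within the tested range.
    """
    if len(concentrations) != len(rates) or len(rates) < 2:
        raise ValueError("need matching lists with at least two points")
    if concentrations[0] != 0:
        raise ValueError("first concentration must be the zero-dose control")
    half = rates[0] / 2.0
    for i in range(len(rates) - 1):
        c_lo, c_hi = concentrations[i], concentrations[i + 1]
        r_lo, r_hi = rates[i], rates[i + 1]
        if r_lo >= half >= r_hi:
            if r_lo == r_hi:
                return c_lo
            frac = (r_lo - half) / (r_lo - r_hi)
            return c_lo + frac * (c_hi - c_lo)
    return None


# Synthetic rates from an assumed curve v = v0 / (1 + c / 60), c in mg/L.
doses = [0, 30, 60, 120, 250]
synthetic_rates = [1.0 / (1.0 + c / 60.0) for c in doses]
ic50 = estimate_ic50(doses, synthetic_rates)

# A nearly flat (tolerant) profile never reaches half activity in range.
tolerant = estimate_ic50(doses, [1.0, 0.98, 0.97, 0.96, 0.95])
```

A return value of `None` corresponds to an enzyme whose rate stays above half of its uninhibited level at every tested dose.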

The bagasse assay results shown in Table 5, which lists the percentage of activity remaining in the absence vs. presence of BG, also demonstrate that the percentage activity of the wild type BD29555 is lower than the percentage activity of the R273K/R422K variant, indicating that the R273K/R422K variant is less sensitive to the presence of cellobiose than the wild type. FIG. 2A-2B shows bar graph data for the bagasse assay of BD29555 vs. the R273K/R422K variant. In FIG. 2A, bars represent relative activity, which has been normalized to wild type activity in the absence of cellobiose (WT+BG=uninhibited activity=1). In FIG. 2B, bars indicate tolerance to cellobiose, as represented by the ratio of activity in the presence of cellobiose (−BG) to that of activity in the absence of cellobiose (+BG); ratios close to 1 indicate greater tolerance to cellobiose. These data again demonstrate that the R273K/R422K variant of BD29555 is more tolerant to cellobiose than the wild type BD29555.
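The two quantities plotted in the bar graphs, activity relative to wild type with BG and the −BG/+BG tolerance ratio, can be computed as sketched below. The function name, the dictionary layout, and all numbers are placeholders chosen for illustration; they are not measured values.

```python
def summarize_bagasse(activities, reference=("WT", "+BG")):
    """Normalize activities to a reference and compute -BG/+BG tolerance ratios.

    `activities` maps (enzyme, condition) to an activity value, where
    condition is "+BG" (cellobiose removed) or "-BG" (cellobiose accumulates).
    """
    ref = activities[reference]
    enzymes = sorted({enzyme for enzyme, _ in activities})
    summary = {}
    for enzyme in enzymes:
        plus = activities[(enzyme, "+BG")]
        minus = activities[(enzyme, "-BG")]
        summary[enzyme] = {
            "relative_plus_bg": plus / ref,
            "relative_minus_bg": minus / ref,
            "tolerance_ratio": minus / plus,
        }
    return summary


# Placeholder activities in arbitrary units, for illustration only.
example = {
    ("WT", "+BG"): 2.0,
    ("WT", "-BG"): 1.0,
    ("variant", "+BG"): 1.5,
    ("variant", "-BG"): 1.2,
}
result = summarize_bagasse(example)
```

In this placeholder data set the variant has lower uninhibited activity than wild type but a tolerance ratio closer to 1, which is the pattern the ratio is meant to expose.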

The wild type and R273K/R422K variant were also characterized in the PASC assay. Results are shown in FIG. 3. The activities of both wild type BD29555 (SEQ ID NO:1) and wild type T. reesei CBH I (SEQ ID NO:2) were inhibited by cellobiose concentrations starting around 1 g/L (with IC₅₀ values of 2.2 and 3 g/L, respectively), whereas the R273K/R422K variant showed little inhibition in the presence of 10 g/L cellobiose.

Characterization of Product Tolerant Variants of T. reesei CBH I

Cellobiose product tolerant substitutions were introduced into T. reesei CBH I (SEQ ID NO:2). A panel of variants with single and double alanine and lysine substitutions at R268 and R411 was expressed and analyzed. The variants were tested for activity on 4-MUL in the presence and absence of 250 mg/L cellobiose and also in the bagasse assay in the absence and presence of BG. The results from these assays were converted into the percentage activity remaining in the presence and absence of cellobiose and BG, respectively. Values are summarized in Table 6.

The 4-MUL assay results shown in Table 6 demonstrate that the activity of the wild type T. reesei CBH I was reduced to 23% in the presence of cellobiose, whereas the double mutants at R268 and R411 retained more than 90% of their activity under the same conditions.

The bagasse assay results shown in Table 6 demonstrate that the activity of the wild type T. reesei CBH I is more significantly impacted by the presence of BG than is the activity of the single or double substitution variants, indicating that the variants are less sensitive to the accumulation of cellobiose than the wild type. FIGS. 4 and 5 show bar graph data for the bagasse assay of wild type T. reesei CBH I vs. the variants. In FIG. 4, bars represent relative activity, normalized to wild type activity in the absence of cellobiose (WT+BG=1). In FIG. 5, bars represent tolerance to cellobiose, as represented by the ratio of activity in the presence of accumulating cellobiose (−BG) to that of activity in the absence of cellobiose (+BG); ratios close to 1 indicate greater tolerance to cellobiose.

Specific Embodiments and Incorporation by Reference

All publications, patents, patent applications and other documents cited in this application are hereby incorporated by reference in their entireties for all purposes to the same extent as if each individual publication, patent, patent application or other document were individually indicated to be incorporated by reference for all purposes.

While various specific embodiments have been illustrated and described, it will be appreciated that various changes can be made without departing from the spirit and scope of the invention(s).

TABLE 1 Sequence Identifier Database (SEQ ID NO:) Accession Number Species of Origin Amino Acid Sequence BD29555* Unknown MSALNSFNMY KSALILGSLL ATAGAQQIGT YTAETHPSLS WSTCKSGGSC TTNSGAITLD ANWRWVHGVN TSTNCYTGNT WNTAICDTDA SCAQDCALDG ADYSGTYGIT TSGNSLRLNF VTGSNVGSRT YLMADNTHYQ IFDLLNQEFT FTVDVSHLPC GLNGALYFVT MDADGGVSKY PNNKAGAQYG VGYCDSQCPR DLKFIAGQAN VEGWTPSSNN ANTGLGNHGA CCAELDIWEA NSISEALTPH PCDTPGLSVC TTDACGGTYS SDRYAGTCDP DGCDFNPYRL GVTDFYGSGK TVDTTKPITV VTQFVTDDGT STGTLSEIRR YYVQNGVVIP QPSSKISGVS GNVINSDFCD AEISTFGETA SFSKHGGLAK MGAGMEAGMV LVMSLWDDYS VNMLWLDSTY PTNATGTPGA ARGSCPTTSG DPKTVESQSG SSYVTFSDIR VGPFNSTFSG GSSTGGSSTT TASGTTTTKA SSTSTSSTST GTGVAAHWGQ CGGQGWTGPT TCASGTTCTV VNPYYSQCL 340514556 Trichoderma reesei MYRKLAVISA FLATARAQSA CTLQSETHPP LTWQKCSSGG TCTQQTGSVV IDANWRWTHA TNSSTNCYDG NTWSSTLCPD NETCAKNCCL DGAAYASTYG VTTSGNSLSI GFVTQSAQKN VGARLYLMAS DTTYQEFTLL GNEFSFDVDV SQLPCGLNGA LYFVSMDADG GVSKYPTNTA GAKYGTGYCD SQCPRDLKFI NGQANVEGWE PSSNNANTGI GGHGSCCSEM DIWEANSISE ALTPHPCTTV GQEICEGDGC GGTYSDNRYG GTCDPDGCDW NPYRLGNTSF YGPGSSFTLD TTKKLTVVTQ FETSGAINRY YVQNGVTFQQ PNAELGSYSG NELNDDYCTA EEAEFGGSSF SDKGGLTQFK KATSGGMVLV MSLWDDYYAN MLWLDSTYPT NETSSTPGAV RGSCSTSSGV PAQVESQSPN AKVTFSNIKF GPIGSTGNPS GGNPPGGNPP GTTTTRRPAT TTGSSPGPTQ SHYGQCGGIG YSGPTVCASG TTCQVLNPYY SQCL 51243029 Penicillium occitanis MSALNSFNMY KSALILGSLL ATAGAQQIGT YTAETHPSLS WSTCKSGGSC TTNSGAITLD ANWRWVHGVN TSTNCYTGNT WNSAICDTDA SCAQDCALDG ADYSGTYGIT TSGNSLRLNF VTGSNVGSRT YLMADNTHYQ IFDLLNQEFT FTVDVSHLPC GLNGALYFVT MDADGGVSKY PNNKAGAQYG VGYCDSQCPR DLKFIAGQAN VEGWTPSANN ANTGIGNHGA CCAELDIWEA NSISEALTPH PCDTPGLSVC TTDACGGTYS SDRYAGTCDP DGCDFNPYRL GVTDFYGSGK TVDTTKPFTV VTQFVTNDGT STGSLSEIRR YYVQNGVVIP QPSSKISGIS GNVINSDYCA AEISTFGGTA SFNKHGGLTN MAAGMEAGMV LVMSLWDDYA VNMLWLDSTY PTNATGTPGA ARGTCATTSG DPKTVESQSG SSYVTFSDIR VGPFNSTFSG GSSTGGSTTT TASRTTTTSA SSTSTSSTST GTGVAGHWGQ CGGQGWTGPT TCVSGTTCTV VNPYYSQCL 7cel (PDB) & Trichoderma reesei ESACTLQSET HPPLTWQKCS SGGTCTQQTG SVVIDANWRW THATNSSTNC 
YDGNTWSSTL CPDNETCAKN CCLDGAAYAS TYGVTTSGNS LSIDFVTQSA QKNVGARLYL MASDTTYQEF TLLGNEFSFD VDVSQLPCGL NGALYFVSMD ADGGVSKYPT NTAGAKYGTG YCDSQCPRDL KFINGQANVE GWEPSSNNAN TGIGGHGSCC SEMDIWQANS ISEALTPHPC TTVGQEICEG DGCGGTYSDN RYGGTCDPDG CDWNPYRLGN TSFYGPGSSF TLDTTKKLTV VTQFETSGAI NRYYVQNGVT FQQPNAELGS YSGNELNDDY CTAEEAEFGG SSFSDKGGLT QFKKATSGGM VLVMSLWDDY YANMLWLDST YPTNETSSTP GAVRGSCSTS SGVPAQVESQ SPNAKVTFSN IKFGPIGSTG NPSG 67516425 Aspergillus nidulans MASSFQLYKA LLFFSSLLSA VQAQKVGTQQ AEVHPGLTWQ TCTSSGSCTT VNGEVTIDAN WRWLHTVNGY FGSC A4 TNCYTGNEWD TSICTSNEVC AEQCAVDGAN YASTYGITTS GSSLRLNFVT QSQQKNIGSR VYLMDDEDTY TMFYLLNKEF TFDVDVSELP CGLNGAVYFV SMDADGGKSR YATNEAGAKY GTGYCDSQCP RDLKFINGVA NVEGWESSDT NPNGGVGNHG SCCAEMDIWE ANSISTAFTP HPCDTPGQTL CTGDSCGGTY SNDRYGGTCD PDGCDFNSYR QGNKTFYGPG LTVDTNSPVT VVTQFLTDDN TDTGTLSEIK RFYVQNGVVI PNSESTYPAN PGNSITTEFC ESQKELFGDV DVFSAHGGMA GMGAALEQGM VLVLSLWDDN YSNMLWLDSN YPTDADPTQP GIARGTCPTD SGVPSEVEAQ YPNAYVVYSN IKFGPIGSTF GNGGGSGPTT TVTTSTATST TSSATSTATG QAQHWEQCGG NGWTGPTVCA SPWACTVVNS WYSQCL 46107376 Gibberella zeae PH-1 MYRAIATASA LIAAVRAQQV CSLTQESKPS LNWSKCTSSG CSNVKGSVTI DANWRWTHQV SGSTNCYTGN KWDTSVCTSG KVCAEKCCLD GADYASTYGI TSSGDQLSLS FVTKGPYSTN IGSRTYLMED ENTYQMFQLL GNEFTFDVDV SNIGCGLNGA LYFVSMDADG GKAKYPGNKA GAKYGTGYCD AQCPRDVKFI NGQANSDGWQ PSDSDVNGGI GNLGTCCPEM DIWEANSIST AYTPHPCTKL TQHSCTGDSC GGTYSNDRYG GTCDADGCDF NSYRQGNKTF YGPGSGFNVD TTKKVTVVTQ FHKGSNGRLS EITRLYVQNG KVIANSESKI AGVPGNSLTA DFCTKQKKVF NDPDDFTKKG AWSGMSDALE APMVLVMSLW HDHHSNMLWL DSTYPTDSTK LGSQRGSCST SSGVPADLEK NVPNSKVAFS NIKFGPIGST YKSDGTTPTN PTNPSEPSNT ANPNPGTVDQ WGQCGGSNYS GPTACKSGFT CKKINDFYSQ CQ 70992391 Aspergillus fumigatus MLASTFSYRM YKTALILAAL LGSGQAQQVG TSQAEVHPSM TWQSCTAGGS CTTNNGKVVI DANWRWVHKV Af293 GDYTNCYTGN TWDTTICPDD ATCASNCALE GANYESTYGV TASGNSLRLN FVTTSQQKNI GSRLYMMKDD STYEMFKLLN QEFTFDVDVS NLPCGLNGAL YFVAMDADGG MSKYPTNKAG AKYGTGYCDS QCPRDLKFIN GQANVEGWQP SSNDANAGTG NHGSCCAEMD IWEANSISTA FTPHPCDTPG QVMCTGDACG GTYSSDRYGG TCDPDGCDFN 
SFRQGNKTFY GPGMTVDTKS KFTVVTQFIT DDGTSSGTLK EIKRFYVQNG KVIPNSESTW TGVSGNSITT EYCTAQKSLF QDQNVFEKHG GLEGMGAALA QGMVLVMSLW DDHSANMLWL DSNYPTTASS TTPGVARGTC DISSGVPADV EANHPDAYVV YSNIKVGPIG STFNSGGSNP GGGTTTTTTT QPTTTTTTAG NPGGTGVAQH YGQCGGIGWT GPTTCASPYT CQKLNDYYSQ CL 121699984 Aspergillus clavatus MLPSTISYRI YKNALFFAAL FGAVQAQKVG TSKAEVHPSM AWQTCAADGT CTTKNGKVVI DANWRWVHDV NRRL 1 KGYTNCYTGN TWNAELCPDN ESCAENCALE GADYAATYGA TTSGNALSLK FVTQSQQKNI GSRLYMMKDD NTYETFKLLN QEFTFDVDVS NLPCGLNGAL YFVSMDADGG LSRYTGNEAG AKYGTGYCDS QCPRDLKFIN GLANVEGWTP SSSDANAGNG GHGSCCAEMD IWEANSISTA YTPHPCDTPG QAMCNGDSCG GTYSSDRYGG TCDPDGCDFN SYRQGNKSFY GPGMTVDTKK KMTVVTQFLT NDGTATGTLS EIKRFYVQDG KVIANSESTW PNLGGNSLTN DFCKAQKTVF GDMDTFSKHG GMEGMGAALA EGMVLVMSLW DDHNSNMLWL DSNSPTTGTS TTPGVARGSC DISSGDPKDL EANHPDASVV YSNIKVGPIG STFNSGGSNP GGSTTTTKPA TSTTTTKATT TATTNTTGPT GTGVAQPWAQ CGGIGYSGPT QCAAPYTCTK QNDYYSQCL 1906845 Claviceps purpurea MHPSLQTILL SALFTTAHAQ QACSSKPETH PPLSWSRCSR SGCRSVQGAV TVDANWLWTT VDGSQNCYTG NRWDTSICSS EKTCSESCCI DGADYAGTYG VTTTGDALSL KFVQQGPYSK NVGSRLYLMK DESRYEMFTL LGNEFTFDVD VSKLGCGLNG ALYFVSMDED GGMKRFPMNK AGAKFGTGYC DSQCPRDVKF INGMANSKDW IPSKSDANAG IGSLGACCRE MDIWEANNIA SAFTPHPCKN SAYHSCTGDG CGGTYSKNRY SGDCDPDGCD FNSYRLGNTT FYGPGPKFTI DTTRKISVVT QFLKGRDGSL REIKRFYVQN GKVIPNSVSR VRGVPGNSIT QGFCNAQKKM FGAHESFNAK GGMKGMSAAV SKPMVLVMSL WDDHNSNMLW LDSTYPTNSR QRGSKRGSCP ASSGRPTDVE SSAPDSTVVF SNIKFGPIGS TFSRGK 1gpi (PDB) & Phanerochaete ESACTLQSET HPPLTWQKCS SGGTCTQQTG SVVIDANWRW THATNSSTNC YDGNTWSSTL CPDNETCAKN chrysosporium CCLDGAAYAS TYGVTTSGNS LSIDFVTQSA QKNVGARLYL MASDTTYQEF TLLGNEFSFD VDVSQLPCGL NGALYFVSMD ADGGVSKYPT NTAGAKYGTG YCDSQCPRDL KFINGQANVE GWEPSSNNAN TGIGGHGSCC SEMDIWQANS ISEALTPHPC TTVGQEICEG DGCGGTYSDN RYGGTCDPDG CDWNPYRLGN TSFYGPGSSF TLDTTKKLTV VTQFETSGAI NRYYVQNGVT FQQPNAELGS YSGNELNDDY CTAEEAEFGG SSFSDKGGLT QFKKATSGGM VLVMSLWDDY YANMLWLDST YPTNETSSTP GAVRGSCSTS SGVPAQVESQ SPNAKVTFSN IKFGPIGSTG NPSG 119468034 Neosartorya fischeri MHQRALLFSA LAVAANAQQV 
GTQKPETHPP LTWQKCTAAG SCSQQSGSVV IDANWRWLHS TKDTTNCYTG NRRL 181 NTWNTELCPD NESCAQNCAV DGADYAGTYG VTTSGSELKL SFVTGANVGS RLYLMQDDET YQHFNLLNNE FTFDVDVSNL PCGLNGALYF VAMDADGGMS KYPSNKAGAK YGTGYCDSQC PRDLKFINGM ANVEGWKPSS NDKNAGVGGH GSCCPEMDIW EANSISTAVT PHPCDDVSQT MCSGDACGGT YSATRYAGTC DPDGCDFNPF RMGNESFYGP GKIVDTKSEM TVVTQFITAD GTDTGALSEI KRLYVQNGKV IANSVSNVAD VSGNSISSDF CTAQKKAFGD EDIFAKHGGL SGMGKALSEM VLIMSIWDDH HSSMMWLDST YPTDADPSKP GVARGTCEHG AGDPEKVESQ HPDASVTFSN IKFGPIGSTY KA 7804883 Leptosphaeria MYRSLIFATS LLSLAKGQLV GNLYCKGSCT AKNGKVVIDA NWRWLHVKGG YTNCYTGNEW NATACPDNKS maculans CATNCAIDGA DYRRLRHYCE RQLLGTEVHH QGLYSTNIGS RTYLMQDDST YQLFKFTGSQ EFTFDVDLSN LPCGLNGALY FVSMDADGGL KKYPTNKAGA KYGTGYCDAQ CPRDLKFING EGNVEGWQPS KNDQNAGVGG HGSCCAEMDI WEANSVSTAV TPHSCSTIEQ SRCDGDGCGG TYSADRYAGV CDPDGCDFNS YRMGVKDFYG KGKTVDTSKK FTVVTQFIGS GDAMEIKRFY VQNGKTIPQP DSTIPGVTGN SITTFFCDAQ KKAFGDKYTF KDKGGMANMP STCNGMVLVM SLWDDHYSNM LWLDSTYPTD KNPDTDAGSG RGECAITSGV PADVESQHPD ASVIYSNIKF GPINTTFG 85108032 Neurospora crassa MLAKFAALAA LVASANAQAV CSLTAETHPS LNWSKCTSSG CTNVAGSITV DANWRWTHIT SGSTNCYSGN N150 (OR74A) EWDTSLCSTN TDCATKCCVD GAEYSSTYGI QTSGNSLSLQ FVTKGSYSTN IGSRTYLMNG ADAYQGFELL GNEFTFDVDV SGTGCGLNGA LYFVSMDLDG GKAKYTNNKA GAKYGTGYCD AQCPRDLKYI NGIANVEGWT PSTNDANAGI GDHGTCCSEM DIWEANKVST AFTPHPCTTI EQHMCEGDSC GGTYSDDRYG GTCDADGCDF NSYRMGNTTF YGEGKTVDTS SKFTVVTQFI KDSAGDLAEI KRFYVQNGKV IENSQSNVDG VSGNSITQSF CNAQKTAFGD IDDFNKKGGL KQMGKALAKP MVLVMSIWDD HAANMLWLDS TYPVEGGPGA YRGECPTTSG VPAEVEANAP NSKVIFSNIK FGPIGSTFSG GSSGTPPSNP SSSVKPVTST AKPSSTSTAS NPSGTGAAHW AQCGGIGFSG PTTCQSPYTC QKINDYYSQC V 169859458 Coprinopsis cinerea MFKKVALTAL CFLAVAQAQQ VGREVAENHP RLPWQRCTRN GGCQTVSNGQ VVLDANWRWL HVTDGYTNCY okayama TGNSWNSTVC SDPTTCAQRC ALEGANYQQT YGITTNGDAL TIKFLTRSQQ TNVGARVYLM ENENRYQMFN LLNKEFTFDV DVSKVPCGIN GALYFIQMDA DGGMSKQPNN RAGAKYGTGY CDSQCPRDIK FIDGVANSAD WTPSETDPNA GRGRYGICCA EMDIWEANSI SNAYTPHPCR TQNDGGYQRC EGRDCNQPRY EGLCDPDGCD YNPFRMGNKD FYGPGKTVDT NRKMTVVTQF 
ITHDNTDTGT LVDIRRLYVQ DGRVIANPPT NFPGLMPAHD SITEQFCTDQ KNLFGDYSSF ARDGGLAHMG RSLAKGHVLA LSIWNDHGAH MLWLDSNYPT DADPNKPGIA RGTCPTTGGT PRETEQNHPD AQVIFSNIKF GDIGSTFSGY 154292161 Botryotinia fuckeliana MYSAAVLATF SFLLGAGAQQ VGTSTAETHP ALTVQKCAAG GTCTDESDSI VLDANWRWLH STSGSTNCYT B05-10 GNTWDTTLCP DAATCTTNCA LDGADYEGTY GITTSGDSLK LSFVTGSNVG SRTYLMDSET TYKEFALLGN EFTFTVDVSK LPCGLNGALY FVPMDADGGM SKYPTNKAGA KYGTGYCDAQ CPQDMKFVNG TANVEGWVPD SNSANSGTGN IGSCCSEFDV WEANSMSQAL TPHVCTVDSQ TACTGDDCAS NTGVCDGDGC DFNPYRMGNT TFYGSGMTID TSKPFSVVTQ FITDDGTETG TLTEIKRFYV QDDVVYEQPS SDISGVSGNS ITDDFCAAQK TAFGDTDYFT QNGGMAAMGK KMADGMVLVL SIWDDYNVNM LWLDSDYPTT KDASTPGVSR GSCATDSGVP ATVEAASGSA YVTFSSIKYG PIGSTFNAPA DSSSSVSASS SPAPIASSSS SASIAPVSSV VAAIVSSSAQ AISSAAPVVS SSAQAISSAA PVVSSVVSSA APVATSSTKS KCSKVSSTLK TSVAAPATSA TSAAVVATSS AASSTGSVPL YGNCTGGKTC SEGTCVVQND YYSQCVASS 169615761 # Phaeosphaeria MTWQRCTGTG GSSCTNVNGE IVIDANWRWI HATGGYTNCF DGNEWNKTAC PSNAACTKNC AIEGSDYRGT nodorum SN15 YGITTSGNSL TLKFITKGQY STNVGSRTYL MKDTNNYEMF NLIGNEFTFD VDLSQLPCGL NGALYFVSMP EKGQGTPGAK YGTGKLSQCS VHISKTLTDA CARDLKFVGG EANADGWQAS TSDPNAGVGK KGACCAEMDV WEANSMSTAL TPHSCQPEGY AVCEESNCGG TYSLDRYAGT CDANGCDFNP YRVGNKDFYG KGKTVDTSKK MTVVTQFLGT GSDLTELKRF YVQDGKVISN PEPTIPGMTG NSITQKWCDT QKEVFKEEVY PFNQWGGMAS MGKGMAQGMV LVMSLWDDHY SNMLWLDSTY PTDRDPESPG AARGECAITS GAPAEVEANN PDASVMFSNI KFGPIGSTFQ QPA 4883502 Humicola grisea MQIKSYIQYL AAALPLLSSV AAQQAGTITA ENHPRMTWKR CSGPGNCQTV QGEVVIDANW RWLHNNGQNC YEGNKWTSQC SSATDCAQRC ALDGANYQST YGASTSGDSL TLKFVTKHEY GTNIGSRFYL MANQNKYQMF TLMNNEFAFD VDLSKVECGI NSALYFVAME EDGGMASYPS NRAGAKYGTG YCDAQCARDL KFIGGKANIE GWRPSTNDPN AGVGPMGACC AEIDVWESNA YAYAFTPHAC GSKNRYHICE TNNCGGTYSD DRFAGYCDAN GCDYNPYRMG NKDFYGKGKT VDTNRKFTVV SRFERNRLSQ FFVQDGRKIE VPPPTWPGLP NSADITPELC DAQFRVFDDR NRFAETGGFD ALNEALTIPM VLVMSIWDDH HSNMLWLDSS YPPEKAGLPG GDRGPCPTTS GVPAEVEAQY PNAQVVWSNI RFGPIGSTVN V 950686 Humicola grisea MRTAKFATLA ALVASAAAQQ ACSLTTERHP SLSWKKCTAG GQCQTVQASI TLDSNWRWTH 
QVSGSTNCYT GNKWDTSICT DAKSCAQNCC VDGADYTSTY GITTNGDSLS LKFVTKGQYS TNVGSRTYLM DGEDKYQTFE LLGNEFTFDV DVSNIGCGLN GALYFVSMDA DGGLSRYPGN KAGAKYGTGY CDAQCPRDIK FINGEANIEG WTGSTNDPNA GAGRYGTCCS EMDIWEANNM ATAFTPHPCT IIGQSRCEGD SCGGTYSNER YAGVCDPDGC DFNSYRQGNK TFYGKGMTVD TTKKITVVTQ FLKDANGDLG EIKRFYVQDG KIIPNSESTI PGVEGNSITQ DWCDRQKVAF GDIDDFNRKG GMKQMGKALA GPMVLVMSIW DDHASNMLWL DSTFPVDAAG KPGAERGACP TTSGVPAEVE AEAPNSNVVF SNIRFGPIGS TVAGLPGAGN GGNNGGNPPP PTTTTSSAPA TTTTASAGPK AGRWQQCGGI GFTGPTQCEE PYTCTKLNDW YSQCL 124491660 Chaetomium MQIKQYLQYL AAALPLVNMA AAQRAGTQQT ETHPRLSWKR CSSGGNCQTV NAEIVIDANW RWLHDSNYQN thermophilum CYDGNRWTSA CSSATDCAQK CYLEGANYGS TYGVSTSGDA LTLKFVTKHE YGTNIGSRVY LMNGSDKYQM FTLMNNEFAF DVDLSKVECG LNSALYFVAM EEDGGMRSYS SNKAGAKYGT GYCDAQCARD LKFVGGKANI EGWRPSTNDA NAGVGPYGAC CAEIDVWESN AYAFAFTPHG CLNNNYHVCE TSNCGGTYSE DRFGGLCDAN GCDYNPYRMG NKDFYGKGKT VDTSRKFTVV TRFEENKLTQ FFIQDGRKID IPPPTWPGLP NSSAITPELC TNLSKVFDDR DRYEETGGFR TINEALRIPM VLVMSIWDGH YANMLWLDSV YPPEKAGQPG AERGPCAPTS GVPAEVEAQF PNAQVIWSNI RFGPIGSTYQ V 58045187 Chaetomium MMYKKFAALA ALVAGAAAQQ ACSLTTETHP RLTWKRCTSG GNCSTVNGAV TIDANWRWTH TVSGSTNCYT thermophilum GNEWDTSICS DGKSCAQTCC VDGADYSSTY GITTSGDSLN LKFVTKHQHG TNVGSRVYLM ENDTKYQMFE LLGNEFTFDV DVSNLGCGLN GALYFVSMDA DGGMSKYSGN KAGAKYGTGY CDAQCPRDLK FINGEANIEN WTPSTNDANA GFGRYGSCCS EMDIWDANNM ATAFTPHPCT IIGQSRCEGN SCGGTYSSER YAGVCDPDGC DFNAYRQGDK TFYGKGMTVD TTKKMTVVTQ FHKNSAGVLS EIKRFYVQDG KIIANAESKI PGNPGNSITQ EWCDAQKVAF GDIDDFNRKG GMAQMSKALE GPMVLVMSVW DDHYANMLWL DSTYPIDKAG TPGAERGACP TTSGVPAEIE AQVPNSNVIF SNIRFGPIGS TVPGLDGSTP SNPTATVAPP TSTTTSVRSS TTQISTPTSQ PGGCTTQKWG QCGGIGYTGC TNCVAGTTCT ELNPWYSQCL 169601100 # Phaeosphaeria MYRNFLYAAS LLSVARSQLV GTQTTETHPG MTWQSCTAKG SCTTCSDNKA CASNCAVDGA DYKGTYGITA nodorum SN15 SGNSLQLKFI TKGSYSTNIG SRTYLMASDT AYQMFKFDGN KEFTFDVDLS GLPCGFNGAL YFVSMDEDGG LKKYSGNKAG AKYGTGYCDA QCPRDLKFIN GEGNVEGWKP SDNDANAGVG GHGSCCAEMD IWEANSISTA VTPHACSTIE QTRCDGDGCG GTYSADRYAG VCDPDGCDFN AYRMGVKNFY GKGMTVDTSK 
KFTVVTQFIG TGDAMEIKRF YVQGGKTIEQ PASTIPGVEG NSITTKFCDQ QKQVFGDRYT YKEKGGTANM AKALAQGMVL VMSLWDDHYS NMLWLDSTYP TDKNPDTDLG SGRGSCDVKS GAPADVESKS PDATVIYSNI KFGPLNSTY 169870197 Coprinopsis cinerea MLGKIAIASL SFLAIAKGQQ VGREVAENHP RLPWQRCTRN GGCQTVSNGQ VVLDANWRWL HVTDGYTNCY okayama TGNSWNSSVC SDGTTCAQRC ALEGANYQQT YGITTSGNSL TMKFLTRSQG TNVGGRVYLM ENENRYQMFN LLNKEFTFDV DVSKVPCGIN GALYFIQMDA DGGMSSQPNN RAGAKYGTGY CDSQCPRDIK FIDGVANSVG WEPSETDSNA GRGRYGICCA EMDIWEANSI SNAYTPHPCR TQNDGGYQRC EGRDCNQPRY EGLCDPDGCD YNPFRMGNKD FYGPGKTIDT NRKMTVVTQF ITHDNTDTGT LVDIRRLYVQ DGRVIANPPT NFPGLMPAHD SITEQFCTDQ KNLFGDYSSF ARDGGLAHMG RSLAKGHVLA LSIWNDHGAH MLWLDSNYPT DADPNKPGIA RGTCPTTGGT PRETEQNHPD AQVIFSNIKF GDIGSTFSGY 3913806 Agaricus bisporus MFPRSILLAL SLTAVALGQQ VGTNMAENHP SLTWQRCTSS GCQNVNGKVT LDANWRWTHR INDFTNCYTG NEWDTSICPD GVTCAENCAL DGADYAGTYG VTSSGTALTL KFVTESQQKN IGSRLYLMAD DSNYEIFNLL NKEFTFDVDV SKLPCGLNGA LYFSEMAADG GMSSTNTAGA KYGTGYCDSQ CPRDIKFIDG EANSEGWEGS PNDVNAGTGN FGACCGEMDI WEANSISSAY TPHPCREPGL QRCEGNTCSV NDRYATECDP DGCDFNSFRM GDKSFYGPGM TVDTNQPITV VTQFITDNGS DNGNLQEIRR IYVQNGQVIQ NSNVNIPGID SGNSISAEFC DQAKEAFGDE RSFQDRGGLS GMGSALDRGM VLVLSIWDDH AVNMLWLDSD YPLDASPSQP GISRGTCSRD SGKPEDVEAN AGGVQVVYSN IKFGDINSTF NNNGGGGGNP SPTTTRPNSP AQTMWGQCGG QGWTGPTACQ SPSTCHVIND FYSQCF 169611094 Phaeosphaeria MYRNLALASL SLFGAARAQQ AGTVTTETHP SLSWKTCTGT GGTSCTTKAG KITLDANWRW THVTTGYTNC nodorum SN15 YDGNSWNTTA CPDGATCTKN CAVDGADYSG TYGITTSSNS LSIKFVTKGS NSANIGSRTY LMESDTKYQM FNLIGQEFTF DVDVSKLPCG LNGALYFVEM AADGGIGKGN NKAGAKYGTG YCDSQCPHDI KFINGKANVE GWNPSDADPN AGSGKIGACC PEMDIWEANS ISTAYTPHPC KGTGLQECTD DVSCGDGSNR YSGLCDKDGC DFNSYRMGVK DFYGPGATLD TTKKMTVVTQ FLGSGSTLSE IKRFYVQNGK VFKNSDSAIE GVTGNSITES FCAAQKTAFG DTNSFKTLGG LNEMGASLAR GHVLVMSLWD DHAVNMLWLD STYPTNSTKL GAQRGTCAID SGKPEDVEKN HPDATVVFSD IKFGPIGSTF QQPS 3131 Phanerochaete MVDIQIATFL LLGVVGVAAQ QVGTYIPENH PLLATQSCTA SGGCTTSSSK IVLDANRRWI HSTLGTTSCL chrysosporium TANGWDPTLC PDGITCANYC ALDGVSYSST YGITTSGSAL RLQFVTGTNI 
GSRVFLMADD THYRTFQLLN QELAFDVDVS KLPCGLNGAL YFVAMDADGG KSKYPGNRAG AKYGTGYCDS QCPRDVQFIN GQANVQGWNA TSATTGTGSY GSCCTELDIW EANSNAAALT PHTCTNNAQT RCSGSNCTSN TGFCDADGCD FNSFRLGNTT FLGAGMSVDT TKTFTVVTQF ITSDNTSTGN LTEIRRFYVQ NGNVIPNSVV NVTGIGAVNS ITDPFCSQQK KAFIETNYFA QHGGLAQLGQ ALRTGMVLAF SISDDPANHM LWLDSNFPPS ANPAVPGVAR GMCSITSGNP ADVGILNPSP YVSFLNIKFG SIGTTFRPA 70991503 Aspergillus fumigatus MHQRALLFSA LAVAANAQQV GTQTPETHPP LTWQKCTAAG SCSQQSGSVV IDANWRWLHS TKDTTNCYTG Af293 NTWNTELCPD NESCAQNCAL DGADYAGTYG VTTSGSELKL SFVTGANVGS RLYLMQDDET YQHFNLLNHE FTFDVDVSNL PCGLNGALYF VAMDADGGMS KYPSNKAGAK YGTGYCDSQC PRDLKFINGM ANVEGWEPSS SDKNAGVGGH GSCCPEMDIW EANSISTAVT PHPCDDVSQT MCSGDACGGT YSESRYAGTC DPDGCDFNPF RMGNESFYGP GKIVDTKSKM TVVTQFITAD GTDSGALSEI KRLYVQNGKV IANSVSNVAG VSGNSITSDF CTAQKKAFGD EDIFAKHGGL SGMGKALSEM VLIMSIWDDH HSSMMWLDST YPTDADPSKP GVARGTCEHG AGDPENVESQ HPDASVTFSN IKFGPIGSTY EG 294196 Phanerochaete MFRTATLLAF TMAAMVFGQQ VGTNTAENHR TLTSQKCTKS GGCSNLNTKI VLDANWRWLH STSGYTNCYT chrysosporium GNQWDATLCP DGKTCAANCA LDGADYTGTY GITASGSSLK LQFVTGSNVG SRVYLMADDT HYQMFQLLNQ EFTFDVDMSN LPCGLNGALY LSAMDADGGM AKYPTNKAGA KYGTGYCDSQ CPRDIKFING EANVEGWNAT SANAGTGNYG TCCTEMDIWE ANNDAAAYTP HPCTTNAQTR CSGSDCTRDT GLCDADGCDF NSFRMGDQTF LGKGLTVDTS KPFTVVTQFI TNDGTSAGTL TEIRRLYVQN GKVIQNSSVK IPGIDPVNSI TDNFCSQQKT AFGDTNYFAQ HGGLKQVGEA LRTGMVLALS IWDDYAANML WLDSNYPTNK DPSTPGVARG TCATTSGVPA QIEAQSPNAY VVFSNIKFGD LNTTYTGTVS SSSVSSSHSS TSTSSSHSSS STPPTQPTGV TVPQWGQCGG IGYTGSTTCA SPYTCHVLNP YYSQCY 18997123 Thermoascus MYQRALLFSF FLAAARAHEA GTVTAENHPS LTWQQCSSGG SCTTQNGKVV IDANWRWVHT TSGYTNCYTG aurantiacus NTWDTSICPD DVTCAQNCAL DGADYSGTYG VTTSGNALRL NFVTQSSGKN IGSRLYLLQD DTTYQIFKLL GQEFTFDVDV SNLPCGLNGA LYFVAMDADG NLSKYPGNKA GAKYGTGYCD SQCPRDLKFI NGQANVEGWQ PSANDPNAGV GNHGSSCAEM DVWEANSIST AVTPHPCDTP GQTMCQGDDC GGTYSSTRYA GTCDPDGCDF NPYQPGNHSF YGPGKIVDTS SKFTVVTQFI TDDGTPSGTL TEIKRFYVQN GKVIPQSEST ISGVTGNSIT TEYCTAQKAA FGDNTGFFTH GGLQKISQAL AQGMVLVMSL WDDHAANMLW LDSTYPTDAD PDTPGVARGT 
CPTTSGVPAD VESQNPNSYV IYSNIKVGPI NSTFTAN 4204214 Humicola grisea var MQIKSYIQYL AAALPLLSSV AAQQAGTITA ENHPRMTWKR CSGPGNCQTV QGEVVIDANW RWLHNNGQNC thermoidea YEGNKWTSQC SSATDCAQRC ALDGANYQST YGASTSGDSL TLKFVTKHEY GTNIGSRFYL MANQNKYQMF TLMNNEFAFD VDLSKVECGI NSALYFVAME EDGGMASYPS NRAGAKYGTG YCDAQCARDL KFIGGKANIE GWRPSTNDPN AGVGPMGACC AEIDVWESNA YAYAFTPHAC GSKNRYHICE TNNCGGTYSD DRFAGYCDAN GCDYNPYRMG NKDFYGKGKT VDTNRKFTVV SRFERNRLSQ FFVQDGRKIE VPPPTWPGLP NSADITPELC DAQFRVFDDR NRFAETGGFD ALNEALTIPM VLVMSIWDDH HSNMLWLDSS YPPEKAGLPG GDRGPCPTTS GVPAEVEAQY PDAQVVWSNI RFGPIGSTVN V 34582632 Trichoderma viride MYRKLAVISA FLATARAQSA CTLQSETHPP LTWQKCSSGG TCTQQTGSVV IDANWRWTHA TNSSTNCYDG (also known as NTWSSTLCPD NETCAKNCCL DGAAYASTYG VTTSGNSLSI GFVTQSAQKN VGARLYLMAS DTTYQEFTLL Hypocrea rufa) GNEFSFDVDV SQLPCGLNGA LYFVSMDADG GVSKYPTNTA GAKYGTGYCD SQCPRDLKFI NGQANVEGWE PSSNNANTGI GGHGSCCSEM DIWEANSISE ALTPHPCTTV GQEICEGDGC GGTYSDNRYG GTCDPDGCDW DPYRLGNTSF YGPGSSFTLD TTKKLTVVTQ FETSGAINRY YVQNGVTFQQ PNAELGSYSG NGLNDDYCTA EEAEFGGSSF SDKGGLTQFK KATSGGMVLV MSLWDDYYAN MLWLDSTYPT NETSSTPGAV RGSCSTSSGV PAQVESQSPN AKVTFSNIKF GPIGSTGDPS GGNPPGGNPP GTTTTRRPAT TTGSSPGPTQ SHYGQCGGIG YSGPTVCASG TTCQVLNPYY SQCL 156712284 Thermoascus MYQRALLFSF FLAAARAQQA GTVTAENHPS LTWQQCSSGG SCTTQNGKVV IDANWRWVHT TSGYTNCYTG aurantiacus NTWDTSICPD DVTCAQNCAL DGADYSGTYG VTTSGNALRL NFVTQSSGKN IGSRLYLLQD DTTYQIFKLL GQEFTFDVDV SNLPCGLNGA LYFVAMDADG GLSKYPGNKA GAKYGTGYCD SQCPRDLKFI NGQANVEGWQ PSANDPNAGV GNHGSCCAEM DVWEANSIST AVTPHPCDTP GQTMCQGDDC GGTYSSTRYA GTCDPDGCDF NPYRQGNHSF YGPGQIVDTS SKFTVVTQFI TDDGTPSGTL TEIKRFYVQN GKVIPQSEST ISGVTGNSIT TEYCTAQKAA FGDNTGFFTH GGLQKISQAL AQGMVLVMSL WDDHAANMLW LDSTYPTDAD PDTPGVARGT CPTTSGVPAD VESQYPNSYV IYSNIKVGPI NSTFTAN 39977899 Magnaporthe grisea MIRKITTLAA LVGVVRGQAA CSLTAETHPS LTWQKCSSGG SCTNVAGSVT IDANWRWTHT TSGYTNCYTG (oryzae) 70-15 NKWDTSICST NADCASKCCV DGANYQQTYG ASTSGNALSL QYVTQSSGKN VGSRLYLLES ENKYQMFNLL GNEFTFDVDA SKLGCGLNGA VYFVSMDADG GQSKYSGNKA GAKYGTGYCD 
SQCPRDLKYI NGAANVEGWQ PSSGDANSGV GNMGSCCAEM DIWEANSIST AYTPHPCSNN AQHSCKGDDC GGTYSSVRYA GDCDPDGCDF NSYRQGNRTF YGPGSNFNVD SSKKVTVVTQ FISSGGQLTD IKRFYVQNGK VIPNSQSTIT GVTGNSVTQD YCDKQKTAFG DQNVFNQRGG LRQMGDALAK GMVLVMSVWD DHHSQMLWLD STYPTTSTAP GAARGSCSTS SGKPSDVQSQ TPGATVVYSN IKFGPIGSTF KSS 20986705 Talaromyces emersonii MLRRALLLSS SAILAVKAQQ AGTATAENHP PLTWQECTAP GSCTTQNGAV VLDANWRWVH DVNGYTNCYT GNTWDPTYCP DDETCAQNCA LDGADYEGTY GVTSSGSSLK LNFVTGSNVG SRLYLLQDDS TYQIFKLLNR EFSFDVDVSN LPCGLNGALY FVAMDADGGV SKYPNNKAGA KYGTGYCDSQ CPRDLKFIDG EANVEGWQPS SNNANTGIGD HGSCCAEMDV WEANSISNAV TPHPCDTPGQ TMCSGDDCGG TYSNDRYAGT CDPDGCDFNP YRMGNTSFYG PGKIIDTTKP FTVVTQFLTD DGTDTGTLSE IKRFYIQNSN VIPQPNSDIS GVTGNSITTE FCTAQKQAFG DTDDFSQHGG LAKMGAAMQQ GMVLVMSLWD DYAAQMLWLD SDYPTDADPT TPGIARGTCP TDSGVPSDVE SQSPNSYVTY SNIKFGPINS TFTAS 22138843 Aspergillus oryzae MHQRALLFSA FWTAVQAQQA GTLTAETHPS LTWQKCAAGG TCTEQKGSVV LDSNWRWLHS VDGSTNCYTG NTWDATLCPD NESCASNCAL DGADYEGTYG VTTSGDALTL QFVTGANIGS RLYLMADDDE SYQTFNLLNN EFTFDVDASK LPCGLNGAVY FVSMDADGGV AKYSTNKAGA KYGTGYCDSQ CPRDLKFING QVRKGWEPSD SDKNAGVGGH GSCCPQMDIW EANSISTAYT PHPCDDTAQT MCEGDTCGGT YSSERYAGTC DPDGCDFNAY RMGNESFYGP SKLVDSSSPV TVVTQFITAD GTDSGALSEI KRFYVQGGKV IANAASNVDG VTGNSITADF CTAQKKAFGD DDIFAQHGGL QGMGNALSSM VLTLSIWDDH HSSMMWLDSS YPEDADATAP GVARGTCEPH AGDPEKVESQ SGSATVTYSN IKYGPIGSTF DAPA 55775695 Penicillium MASTLSFKIY KNALLLAAFL GAAQAQQVGT STAEVHPSLT WQKCTAGGSC TSQSGKVVID SNWRWVHNTG chrysogenum GYTNCYTGND WDRTLCPDDV TCATNCALDG ADYKGTYGVT ASGSSLRLNF VTQASQKNIG SRLYLMADDS KYEMFQLLNQ EFTFDVDVSN LPCGLNGALY FVAMDEDGGM ARYPTNKAGA KYGTGYCDAQ CPRDLKFING QANVEGWEPS SSDVNGGTGN YGSCCAEMDI WEANSISTAF TPHPCDDPAQ TRCTGDSCGG TYSSDRYGGT CDPDGCDFNP YRMGNQSFYG PSKIVDTESP FTVVTQFITN DGTSTGTLSE IKRFYVQNGK VIPQSVSTIS AVTGNSITDS FCSAQKTAFK DTDVFAKHGG MAGMGAGLAE GMVLVMSLWD DHAANMLWLD STYPTSASST TPGAARGSCD ISSGEPSDVE ANHSNAYVVY SNIKVGPLGS TFGSTDSGSG TTTTKVTTTT ATKTTTTTGP STTGAAHYAQ CGGQNWTGPT TCASPYTCQR QGDYYSQCL 171676762 Podospora anserina 
MVSAKFAALA ALVASASAQQ VCSLTPESHP PLTWQRCSAG GSCTNVAGSV TLDSNWRWTH TLQGSTNCYS GNEWDTSICT TGTKCAQNCC VEGAEYAATY GITTSGNQLN LKFVTEGKYS TNVGSRTYLM ENATKYQGFN LLGNEFTFDV DVSNIGCGLN GALYFVSMDL DGGLAKYSGN KAGAKYGTGY CDAQCPRDIK FINGEANIEG WNPSTNDVNA GAGRYGTCCS EMDIWEANNM ATAYTPHSCT ILDQSRCEGE SCGGTYSSDR YGGVCDPDGC DFNSYRMGNK EFYGKGKTVD TTKKMTVVTQ FLKNAAGELS EIKRFYVQNG VVIPNSVSSI PGVPNQNSIT QDWCDAQKIA FGDPDDNTAK GGLRQMGLAL DKPMVLVMSI WNDHAAHMLW LDSTYPVDAA GRPGAERGAC PTTSGVPSEV EAEAPNSNVA FSNIKFGPIG STFNSGSTNP NPISSSTATT PTSTRVSSTS TAAQTPTSAP GGTVPRWGQC GGQGYTGPTQ CVAPYTCVVS NQWYSQCL 146350520 Pleurotus sp Florida MFPYIALVSF SFLSVVLAQQ VGTLTAETHP QLTVQQCTRG GSCTTQQRSV VLDGNWRWLH STSGSNNCYT GNTWDTSLCP DAATCSRNCA LDGADYSGTY GITSSGNALT LKFVTHGPYS TNIGSRVYLL ADDSHYQMFN LKNKEFTFDV DVSQLPCGLN GALYFSQMDA DGGTGRFPNN KAGAKYGTGY CDSQCPHDIK FINGEANVQG WQPSPNDSNA GKGQYGSCCA EMDIWEANSM ASAYTPHPCT VTTPTRCQGN DCGDGDNRYG GVCDKDGCDF NSFRMGDKNF LGPGKTVNTN SKFTVVTQFL TSDNTTSGTL SEIRRLYVQN GRVIQNSKVN IPGMASTLDS ITESFCSTQK TVFGDTNSFA SKGGLRAMGN AFDKGMVLVL SIWDDHEAKM LWLDSNYPLD KSASAPGVAR GTCATTSGEP KDVESQSPNA QVIFSNIKYG DIGSTYSN 37732123 Gibberella zeae MYRAIATASA LIAAVRAQQV CSLTQESKPS LNWSKCTSSG CSNVKGSVTI DANWRWTHQV SGSTNCYTGN KWDTSVCTSG KVCAERCCLD GADYASTYGI TSSGDQLSLS FVTKGPYSTN IGSRTYLMED ENTYQMFQLL GNEFTFDVDV SNIGCGLNGA LYFVSMDADG GKAKYPGNKA GAKYGTGYCD AQCPRDVKFI NGQANSDGWQ PSDSDVNGGI GNLGTCCPEM DIWEANSIST AYTPHPCTKL TQHSCTGDSC GGTYSNDRYG GTCDADGCDF NSYRQGNKTF YGPGSGFNVD TTKKVTVVTQ FHKGSNGRLS EITRLYVQNG KVIANSESKI AGVPGNSLTA DFCTKQKKVF NDPDDFTKKG AWSGMSDALE APMVLVMSLW HDHHSNMLWL DSTYPTDSTK LGSQRGSCST SSGVPADLEK NVPNSKVAFS NIKFGPIGST YKSDGTTPTN PTNPSEPSNT ANPNPGTVDQ WGQCGGSNYS GPTACKSGFT CKKINDFYSQ CQ 156055188 Sclerotinia MYSAAVLATF SFLLGAGAQQ VGTLKTESHP PLTIQKCAAG GTCTDEADSV VLDANWRWLH STSGSTNCYT sclerotiorum 1980 GNTWDTTLCP DAATCTANCA FDGADYEGTY GITSSGDSLK LSFVTGSNVG SRTYLMDSET TYKEFALLGN EFTFTVDVSK LPCGLNGALY FVPMDADGGM SKYPTNKAGA KYGTGYCDAQ CPQDMKFVSG GANNEGWVPD SNSANSGTGN IGSCCSEFDV 
WEANSMSQAL TPHTCTVDGQ TACTGDDCAG NTGVCDADGC DFNPYRMGNT TFYGSGKTID TTKPFSVVTQ FITDDGTETG TLTEIKRFYV QDDVVYEQPN SDISGVSGNS ITDDFCTAQK TAFGDTDYFS QKGGMAAMGK KMADGMVLVL SIWDDYNVNM LWLDSDYPTT KDASTPGVSR GSCATTSGVP ATVEAASGSA YVTFSSIKYG PIGSTFKAPA DSSSPVVASS SPAAVAAVVS TSSAQAVPSH PAVSSSQAAV STPEAVSSAP EVPASSSAAQ SVAPTSTKPK CSKVSQSSTL ATSVAAPATT ATSAAVAATS AASSSGSVPL YGNCTGGKTC SEGTCVVQNP WYSQCVASS 453224 Phanerochaete MFRAAALLAF TCLAMVSGQQ AGTNTAENHP QLQSQQCTTS GGCKPLSTKV VLDSNWRWVH STSGYTNCYT chrysosporium GNEWDTSLCP DGKTCAANCA LDGADYSGTY GITSTGTALT LKFVTGSNVG SRVYLMADDT HYQLLKLLNQ EFTFDVDMSN LPCGLNGALY LSAMDADGGM SKYPGNKAGA KYGTGYCDSQ CPKDIKFING EANVGNWTET GSNTGTGSYG TCCSEMDIWE ANNDAAAFTP HPCTTTGQTR CSGDDCARNT GLCDGDGCDF NSFRMGDKTF LGKGMTVDTS KPFTVVTQFL TNDNTSTGTL SEIRRIYIQN GKVIQNSVAN IPGVDPVNSI TDNFCAQQKT AFGDTNWFAQ KGGLKQMGEA LGNGMVLALS IWDDHAANML WLDSDYPTDK DPSAPGVARG TCATTSGVPS DVESQVPNSQ VVFSNIKFGD IGSTFSGTSS PNPPGGSTTS SPVTTSPTPP PTGPTVPQWG QCGGIGYSGS TTCASPYTCH VLNPYYSQCY 50402144 Trichoderma reesei MYRKLAVISA FLATARAQSA CTLQSETHPP LTWQKCSSGG TCTQQTGSVV IDANWRWTHA TNSSTNCYDG NTWSSTLCPD NETCAKNCCL DGAAYASTYG VTTSGNSLSI GFVTQSAQKN VGARLYLMAS DTTYQEFTLL GNEFSFDVDV SQLPCGLNGA LYFVSMDADG GVSKYPTNTA GAKYGTGYCD SQCPRDLKFI NGQANVEGWE PSSNNANTGI GGHGSCCSEM DIWEANSISE ALTPHPCTTV GQEICEGDGC GGTYSDNRYG GTCDPDGCDW NPYRLGNTSF YGPGSSFTLD TTKKLTVVTQ FETSGAINRY YVQNGVTFQQ PNAELGSYSG NELNDDYCTA EEAEFGGSSF SDKGGLTQFK KATSGGMVLV MSLWDDYYAN MLWLDSTYPT NETSSTPGAV RGSCSTSSGV PAQVESQSPN AKVTFSNIKF GPIGSTGNPS GGNPPGGNRG TTTTRRPATT TGSSPGPTQS HYGQCGGIGY SGPTVCASGT TCQVLNPYYS QCL 115397177 Aspergillus terreus MPSTYDIYKK LLLLASFLSA SQAQQVGTSK AEVHPSLTWQ TCTSGGSCTT VNGKVVVDAN WRWVHNVDGY NIH2624 NNCYTGNTWD TTLCPDDETC ASNCALEGAD YSGTYGVTTS GNSLRLNFVT QASQKNIGSR LYLMEDDSTY KMFKLLNQEF TFDVDVSNLP CGLNGAVYFV SMDADGGMAK YPANKAGAKY GTGYCDSQCP RDLKFINGMA NVEGWEPSAN DANAGTGNHG SCCAEMDIWE ANSISTAYTP HPCDTPGQVM CTGDSCGGTY SSDRYGGTCD PDGCDFNSYR QGNKTFYGPG MTVDTKSKIT VVTQFLTNDG TASGTLSEIK RFYVQNGKVI 
PNSESTWSGV SGNSITTAYC NAQKTLFGDT DVFTKHGGME GMGAALAEGM VLVLSLWDDH NSNMLWLDSN YPTDKPSTTP GVARGSCDIS SGDPKDVEAN DANAYVVYSN IKVGPIGSTF SGSTGGGSSS STTATSKTTT TSATKTTTTT TKTTTTTSAS STSTGGAQHW AQCGGIGWTG PTTCVAPYTC QKQNDYYSQC L 154312003 Botryotinia fuckeliana MISKVLAFTS LLAAARAQQA GTLTTETHPP LSVSQCTASG CTTSAQSIVV DANWRWLHST TGSTNCYTGN B05-10 TWDKTLCPDG ATCAANCALD GADYSGVYGI TTSGNSIKLN FVTKGANTNV GSRTYLMAAG STTQYQMLKL LNQEFTFDVD VSNLPCGLNG ALYFAAMDAD GGLSRFPTNK AGAKYGTGYC DAQCPQDIKF INGVANSVGW TPSSNDVNAG AGQYGSCCSE MDIWEANKIS AAYTPHPCSV DTQTRCTGTD CGIGARYSSL CDADGCDFNS YRQGNTSFYG AGLTVNTNKV FTVVTQFITN DGTASGTLKE IRRFYVQNGV VIPNSQSTIA GVPGNSITDS FCAAQKTAFG DTNEFATKGG LATMSKALAK GMVLVMSIWD DHTANMLWLD APYPATKSPS APGVTRGSCS ATSGNPVDVE ANSPGSSVTF SNIKWGPINS TYTGSGAAPS VPGTTTVSSA PASTATSGAG GVAKYAQCGG SGYSGATACV SGSTCVALNP YYSQCQ 49333365 Volvariella volvacea MFPAATLFAF SLFAAVYGQQ VGTQLAETHP RLTWQKCTRS GGCQTQSNGA IVLDANWRWV HNVGGYTNCY TGNTWNTSLC PDGATCAKNC ALDGANYQST YGITTSGNAL TLKFVTQSEQ KNIGSRVYLL ESDTKYQLFN PLNQEFTFDV DVSQLPCGLN GAVYFSAMDA DGGMSKFPNN AAGAKYGTGY CDSQCPRDIK FINGEANVQG WQPSPNDTNA GTGNYGACCN EMDVWEANSI STAYTPHPCT QQGLVRCSGT ACGGGSNRYG SICDPDGCDF NSFRMGDKSF YGPGLTVNTQ QKFTVVTQFL TNNNSSSGTL REIRRLYVQN GRVIQNSKVN IPGMPSTMDS VTTEFCNAQK TAFNDTFSFQ QKGGMANMSE ALRRGMVLVL SIWDDHAANM LWLDSNYPTD RPASQPGVAR GTCPTSSGKP SDVENSTANS QVIYSNIKFG DIGSTYSA 729650 Penicillium MKGSISYQIY KGALLLSALL NSVSAQQVGT LTAETHPALT WSKCTAGXCS QVSGSVVIDA NWPXVHSTSG janthinellum STNCYTGNTW DATLCPDDVT CAANCAVDGA RRQHLRVTTS GNSLRINFVT TASQKNIGSR LYLLENDTTY QKFNLLNQEF TFDVDVSNLP CGLNGALYFV DMDADGGMAK YPTNKAGAKY GTGYCDSQCP RDLKFINGQA NVDGWTPSKN DVNSGIGNHG SCCAEMDIWE ANSISNAVTP HPCDTPSQTM CTGQRCGGTY STDRYGGTCD PDGCDFNPYR MGVTNFYGPG ETIDTKSPFT VVTQFLTNDG TSTGTLSEIK RFYVQGGKVI GNPQSTIVGV SGNSITDSWC NAQKSAFGDT NEFSKHGGMA GMGAGLADGM VLVMSLWDDH ASDMLWLDST YPTNATSTTP GAKRGTCDIS RRPNTVESTY PNAYVIYSNI KTGPLNSTFT GGTTSSSSTT TTTSKSTSTS SSSKTTTTVT TTTTSSGSSG TGARDWAQCG GNGWTGPTTC VSPYTCTKQN DWYSQCL 146424871 
Pleurotus sp Florida MFRTAALTAF TLAAVVLGQQ VGTLTAENHP ALSIQQCTAS GCTTQQKSVV LDSNWRWTHS LPVHTNCYTG NAWDASLCPD PTTCATNCAI DGADYSGTYG ITTSGNALTL RFVTNGPYSK NIGSRVYLLD DADHYKMFDL KNQEFTFDVD MSGLPCGLNG ALYFSEMPAD GGKAAHTSNK AGAKYGTGYC DAQCPHDIKW INGEANILDW SASATDANAG NGRYGACCAE MDIWEANSEA TAYTPHVCRD EGLYRCSGTE CGDGDNRYGG VCDKDGCDFN SYRMGDKNFL GRGKTIDTTK KITVVTQFIT DDNTSSGNLV EIRRVYVQDG VTYQNSFSTF PSLSQYNSIS DDFCVAQKTL FGDNQYYNTH GGTEKMGDAM ANGMVLIMSL WSDHAAHMLW LDSDYPLDKS PSEPGVSRGA CATTTGDPDD VVANHPNASV TFSNIKYGPI GSTYGGSTPP VSSGNTSAPP VTSTTSSGPT TPTGPTGTVP KWGQCGGNGY SGPTTCVAGS TCTYSNDWYS QCL 67538012 Aspergillus nidulans MYQRALLFSA LLSVSRAQQA GTAQEEVHPS LTWQRCEASG SCTEVAGSVV LDSNWRWTHS VDGYTNCYTG FGSC A4 NEWDATLCPD NESCAQNCAV DGADYEATYG ITSNGDSLTL KFVTGSNVGS RVYLMEDDET YQMFDLLNNE FTFDVDVSNL PCGLNGALYF TSMDADGGLS KYEGNTAGAK YGTGYCDSQC PRDIKFINGL GNVEGWEPSD SDANAGVGGM GTCCPEMDIW EANSISTAYT PHPCDSVEQT MCEGDSCGGT YSDDRYGGTC DPDGCDFNSY RMGNTSFYGP GAIIDTSSKF TVVTQFIADG GSLSEIKRFY VQNGEVIPNS ESNISGVEGN SITSEFCTAQ KTAFGDEDIF AQHGGLSAMG DAASAMVLIL SIWDDHHSSM MWLDSSYPTD ADPSQPGVAR GTCEQGAGDP DVVESEHADA SVTFSNIKFG PIGSTF 62006162 Fusarium poae MYRAIATASA LIAAVRAQQV CSLTTETKPA LTWSKCTSSG CSNVQGSVTI DANWRWTHQV SGSTNCHTGN KWDTSVCTSG KVCAEKCCVD GADYASTYGI TSSGNQLSLS FVTKGSYGTN IGSRTYLMED ENTYQMFQLL GNEFTFDVDV SNIGCGLNGA LYFVSMDADG GKAKYPGNKA GAKYGTGYCD AQCPRDVKFI NGQANSDGWE PSKSDVNGGI GNLGTCCPEM DIWEANSIST AYTPHPCTKL TQHACTGDSC GGTYSNDRYG GTCDADGCDF NAYRQGNKTF YGPGSGFNVD TTKKVTVVTQ FHKGSNGRLS EITRLYVQNG KVIANSESKI AGNPGSSLTS DFCTTQKKVF GDIDDFAKKG AWNGMSDALE APMVLVMSLW HDHHSNMLWL DSTYPTDSTA LGSQRGSCST SSGVPADLEK NVPNSKVAFS NIKFGPIGST YNKEGTQPQP TNPTNPNPTN PTNPGTVDQW GQCGGTNYSG PTACKSPFTC KKINDFYSQC Q 146424873 Pleurotus sp Florida MFRTAALTAF TLAAVVLGQQ VGTLAAENHP ALSIQQCTAS GCTTQQKSVV LDSNWRWTHS TAGATNCYTG NAWDSSLCPN PTTCATNCAI DGADYSGTYG ITTSGNSLTL RFVTNGQYSE NIGSRVYLLD DADHYKLFNL KNQEFTFDVD MSGLPCGLNG ALYFSEMAAD GGKAAHTGNN AGAKYGTGYC DAQCPHDIKW INGEANILDW SGSATDPNAG NGRYGACCAE 
MDIWEANSEA TAYTPHVCRD EGLYRCSGTE CGDGDNRYGG VCDKDGCDFN SYRMGDKNFL GRGKTIDTTK KITVVTQFIT DDNTPTGNLV EIRRVYVQDG VTYQNSFSTF PSLSQYNSIS DDFCVAQKTL FGDNQYYNTH GGTEKMGDSL ANGMVLIMSL WSDHAAHMLW LDSDYPLDKS PSEPGVSRGA CATTTGDPDD VVANHPNASV TFSNIKYGPI GSTYGGSTPP VSSGNTSVPP VTSTTSSGPT TPTGPTGTVP KWGQCGGIGY SGPTSCVAGS TCTYSNEWYS QCL 295937 Trichoderma viride MYQKLALISA FLATARAQSA CTLQAETHPP LTWQKCSSGG TCTQQTGSVV IDANWRWTHA TNSSTNCYDG NTWSSTLCPD NETCAKNCCL DGAAYASTYG VTTSADSLSI GFVTQSAQKN VGARLYLMAS DTTYQEFTLL GNEFSFDVDV SQLPCGLNGA LYFVSMDADG GVTKYPTNTA GAKYGTGYCD SQCPRDLKFI NGQANVEGWE PSSNNANTGI GGHGSCCSEM DIWEANSISE ALTPHPCTTV GQEICEGDSC GGTYSGDRYG GTCDPDGCDW NPYRLGNTSF YGPGSSFTLD TTKKLTVVTQ FETSGAINRY YVQNGVTFQQ PNAELGDYSG NSLDDDYCAA EEAEFGGSSF SDKGGLTQFK KATSGGMVLV MSLWDDYYAN MLWLDSTYPT DETSSTPGAV RGSSSTSSGV PAQLESNSPN AKVVYSNIKF GPIGSTGNPS GGNPPGGNPP GTTTPRPATS TGSSPGPTQT HYGQCGGIGY IGPTVCASGS TCQVLNPYYS QCL 6179889 # Alternaria alternata MTWQSCTAKG SCTNKNGKIV IDANWRWLHK KEGYDNCYTG NEWDATACPD NKACAANCAV DGADYSGTYG ITAGSNSLKL KFITKGSYST NIGSRTYLMK DDTTYEMFKF TGNQEFTFDV DVSNLPCGFN GALYFVSMDA DGGLKKYSTN KAGAKYGTGY CDAQCPRDLK FINGEGNVEG WKPSSNDANA GVGGHGSCCA EMDIWEANSV STAVTPHSCS TIEQSRCDGD GCGGTYSADR YAGVCDPDGC DFNSYRMGVK DFYGKGKTVD TSKKFTVVTQ FIGTGDAMEI KRFYVQNGKT IAQPASAVPG VEGNSITTKF CDQQKAVFGD TYTFKDKGGM ANMAKALANG MVLVMSLWDD HYSNMLWLDS TYPTDKNPDT DLGTGRGECE TSSGVPADVE SQHADATVVY SNIKFGPLNS TFG 119483864 Neosartorya fischeri MASAISFQVY RSALILSAFL PSITQAQQIG TYTTETHPSM TWETCTSGGS CATNQGSVVM DANWRWVHQV NRRL 181 GSTTNCYTGN TWDTSICDTD ETCATECAVD GADYESTYGV TTSGSQIRLN FVTQNSNGAN VGSRLYMMAD NTHYQMFKLL NQEFTFDVDV SNLPCGLNGA LYFVTMDEDG GVSKYPNNKA GAQYGVGYCD SQCPRDLKFI QGQANVEGWT PSSNNENTGL GNYGSCCAEL DIWESNSISQ ALTPHPCDTA TNTMCTGDAC GGTYSSDRYA GTCDPDGCDF NPYRMGNTTF YGPGKTIDTN SPFTVVTQFI TDDGTDTGTL SEIRRYYVQN GVTYAQPDSD ISGITGNAIN ADYCTAENTV FDGPGTFAKH GGFSAMSEAM STGMVLVMSL WDDYYADMLW LDSTYPTNAS SSTPGAVRGS CSTDSGVPAT IESESPDSYV TYSNIKVGPI GSTFSSGSGS GSSGSGSSGS ASTSTTSTKT 
TAATSTSTAV AQHYSQCGGQ DWTGPTTCVS PYTCQVQNAY YSQCL 85083281 Neurospora crassa MKAYFEYLVA ALPLLGLATA QQVGKQTTET HPKLSWKKCT GKANCNTVNA EVVIDSNWRW LHDSSGKNCY OR74A DGNKWTSACS SATDCASKCQ LDGANYGTTY GASTSGDALT LKFVTKHEYG TNIGSRFYLM NGASKYQMFT LMNNEFAFDV DLSTVECGLN AALYFVAMEE DGGMASYSSN KAGAKYGTGY CDAQCARDLK FVGGKANIEG WTPSTNDANA GVGPYGGCCA EIDVWESNAH SFAFTPHACK TNKYHVCERD NCGGTYSEDR FAGLCDANGC DYNPYRMGNT DFYGKGKTVD TSKKFTVVSR FEENKLTQFF VQNGQKIEIP GPKWDGIPSD NANITPEFCS AQFQAFGDRD RFAEVGGFAQ LNSALRMPMV LVMSIWDDHY ANMLWLDSVY PPEKEGQPGA ARGDCPQSSG VPAEVESQYA NSKVVYSNIR FGPVGSTVNV 3913803 Cryphonectria MFSKFALTGS LLAGAVNAQG VGTQQTETHP QMTWQSCTSP SSCTTNQGEV VIDSNWRWVH DKDGYVNCYT parasitica GNTWNTTLCP DDKTCAANCV LDGADYSSTY GITTSGNALS LQFVTQSSGK NIGSRTYLME SSTKYHLFDL IGNEFAFDVD LSKLPCGLNG ALYFVTMDAD GGMAKYSTNT AGAEYGTGYC DSQCPRDLKF INGQGNVEGW TPSTNDANAG VGGLGSCCSE MDVWEANSMD MAYTPHPCET AAQHSCNADE CGGTYSSSRY AGDCDPDGCD WNPFRMGNKD FYGSGDTVDT SQKFTVVTQF HGSGSSLTEI SQYYIQGGTK IQQPNSTWPT LTGYNSITDD FCKAQKVEFN DTDVFSEKGG LAQMGAGMAD GMVLVMSLWD DHYANMLWLD STYPVDADAS SPGKQRGTCA TTSGVPADVE SSDASATVIY SNIKFGPIGA TY 60729633 Corticium rolfsii MFPAAALLSF TLLAVASAQQ IGTNTAEVHP SLTVSQCTTS GGCTSSTQSI VLDANWRWLH STSGYTNCYT GNQWNSDLCP DPDTCATNCA LDGASYESTY GISTDGNAVT LNFVTQGSQT NVGSRVYLLS DDTHYQTFSL LNKEFSFDVD ASNIGCGING AVYFVQMDAD GGLSKYSSNK AGAQYGTGYC DSQCPQDIKF INGEANLLDW NATSANSGTG SYGSCCPEMD IWEANKYAAA YTPHPCSVSG QTRCTGTSCG AGSERYDGYC DKDGCDFNSW RMGNETFLGP GMTIDTNKKF TIVTQFITDD NTANGTLSEI RRLYVQGGTV IQNSVANQPN IPKVNSITDS FCTAQKTEFG DQDYFGTIGG LSQMGKAMSD MVLVMSIWDD YDAEMLWLDS NYPTSGSAST PGISRGPCSA TSGLPATVES QQASASVTYS NIKWGDIGST YSGSGSSGSS SSSSSSAASA STSTHTSAAA TATSSAAAAT GSPVPAYGQC GGQSYTGSTT CASPYVCKVS NAYYSQCLPA 39971383 Magnaporthe grisea MKRALCASLS LLAAAVAQQV GTNEPEVHPK MTWKKCSSGG SCSTVNGEVV IDGNWRWIHN IGGYENCYSG 70-15 NKWTSVCSTN ADCATKCAME GAKYQETYGV STSGDALTLK FVQQNSSGKN VGSRMYLMNG ANKYQMFTLK NNEFAFDVDL SSVECGMNSA LYFVPMKEDG GMSTEPNNKA GAKYGTGYCD AQCARDLKFI GGKGNIEGWQ PSSTDSSAGI 
GAQGACCAEI DIWESNKNAF AFTPHPCENN EYHVCTEPNC GGTYADDRYG GGCDANGCDY NPYRMGNPDF YGPGKTIDTN RKFTVISRFE NNRNYQILMQ DGVAHRIPGP KFDGLEGETG ELNEQFCTDQ FTVFDERNRF NEVGGWSKLN AAYEIPMVLV MSIWSDHFAN MLWLDSTYPP EKAGQPGSAR GPCPADGGDP NGVVNQYPNA KVIWSNVRFG PIGSTYQVD 39973029 Magnaporthe grisea MQLTKAGVFL GALMGGAAAQ QVGTQTAENH PKMTWKKCTG KASCTTVNGE VVIDANWRWL HDASSKNCYD 70-15 GNRWTDSCRT ASDCAAKCSL EGADYAKTYG ASTSGDALSL KFVTRHDYGT NIGSRFYLMN GASKYQMFSL LGNEFAFDVD LSTIECGLNS ALYFVAMEED GGMKSYSSNK AGAKYGTGYC DAQCARDLKF VGGKANIEGW KPSSNDANAG VGPYGACCAE IDVWESNAHA FAFTPHPCTD NKYHVCQDSN CGGTYSDDRF AGKCDANGCD INPYRLGNTD FYGKGKTVDT SKKFTVVTRF ERDALTQFFV QNNKRIDMPS PALEGLPATG AITAEYCTNV FNVFGDRNRF DEVGGWSQLQ QALSLPMVLV MSIWDDHYSN MLWLDSVYPP DKEGSPGAAR GDCPQDSGVP SEVESQIPGA TVVWSNIRFG PVGSTVNV 1170141 Fusarium oxysporum MYRIVATASA LIAAARAQQV CSLNTETKPA LTWSKCTSSG CSDVKGSVVI DANWRWTHQT SGSTNCYTGN KWDTSICTDG KTCAEKCCLD GADYSGTYGI TSSGNQLSLG FVTNGPYSKN IGSRTYLMEN ENTYQMFQLL GNEFTFDVDV SGIGCGLNGA PHFVSMDEDG GKAKYSGNKA GAKYGTGYCD AQCPRDVKFI NGVANSEGWK PSDSDVNAGV GNLGTCCPEM DIWEANSIST AFTPHPCTKL TQHSCTGDSC GGTYSSDRYG GTCDADGCDF NAYRQGNKTF YGPGSNFNID TTKKMTVVTQ FHKGSNGRLS EITRLYVQNG KVIANSESKI AGNPGSSLTS DFCSKQKSVF GDIDDFSKKG GWNGMSDALS APMVLVMSLW HDHHSNMLWL DSTYPTDSTK VGSQRGSCAT TSGKPSDLER DVPNSKVSFS NIKFGPIGST YKSDGTTPNP PASSSTTGSS TPTNPPAGSV DQWGQCGGQN YSGPTTCKSP FTCKKINDFY SQCQ 121710012 Aspergillus clavatus MYQRALLFSA LATAVSAQQV GTQKAEVHPA LTWQKCTAAG SCTDQKGSVV IDANWRWLHS TEDTTNCYTG NRRL 1 NEWNAELCPD NEACAKNCAL DGADYSGTYG VTADGSSLKL NFVTSANVGS RLYLMEDDET YQMFNLLNNE FTFDVDVSNL PCGLNGALYF VSMDADGGLS KYPGNKAGAK YGTGYCDSQC PRDLKFINGE ANVEGWKPSD NDKNAGVGGY GSCCPEMDIW EANSISTAYT PHPCDGMEQT RCDGNDCGGT YSSTRYAGTC DPDGCDFNSF RMGNESFYGP GGLVDTKSPI TVVTQFVTAG GTDSGALKEI RRVYVQGGKV IGNSASNVAG VEGDSITSDF CTAQKKAFGD EDIFSKHGGL EGMGKALNKM ALIVSIWDDH ASSMMWLDST YPVDADASTP GVARGTCEHG LGDPETVESQ HPDASVTFSN IKFGPIGSTY KSV 17902580 Penicillium MSALNSFNMY KSALILGSLL ATAGAQQIGT YTAETHPSLS WSTCKSGGSC TTNSGAITLD 
ANWRWVHGVN funiculosum TSTNCYTGNT WNTAICDTDA SCAQDCALDG ADYSGTYGIT TSGNSLRLNF VTGSNVGSRT YLMADNTHYQ IFDLLNQEFT FTVDVSNLPC GLNGALYFVT MDADGGVSKY PNNKAGAQYG VGYCDSQCPR DLKFIAGQAN VEGWTPSTNN SNTGIGNHGS CCAELDIWEA NSISEALTPH PCDTPGLTVC TADDCGGTYS SNRYAGTCDP DGCDFNPYRL GVTDFYGSGK TVDTTKPFTV VTQFVTDDGT SSGSLSEIRR YYVQNGVVIP QPSSKISGIS GNVINSDFCA AELSAFGETA SFTNHGGLKN MGSALEAGMV LVMSLWDDYS VNMLWLDSTY PANETGTPGA ARGSCPTTSG NPKTVESQSG SSYVVFSDIK VGPFNSTFSG GTSTGGSTTT TASGTTSTKA STTSTSSTST GTGVAAHWGQ CGGQGWTGPT TCASGTTCTV VNPYYSQCL 1346226 Humicola grisea var MRTAKFATLA ALVASAAAQQ ACSLTTERHP SLSWNKCTAG GQCQTVQASI TLDSNWRWTH QVSGSTNCYT thermoidea GNKWDTSICT DAKSCAQNCC VDGADYTSTY GITTNGDSLS LKFVTKGQHS TNVGSRTYLM DGEDKYQTFE LLGNEFTFDV DVSNIGCGLN GALYFVSMDA DGGLSRYPGN KAGAKYGTGY CDAQCPRDIK FINGEANIEG WTGSTNDPNA GAGRYGTCCS EMDIWEANNM ATAFTPHPCT IIGQSRCEGD SCGGTYSNER YAGVCDPDGC DFNSYRQGNK TFYGKGMTVD TTKKITVVTQ FLKDANGDLG EIKRFYVQDG KIIPNSESTI PGVEGNSITQ DWCDRQKVAF GDIDDFNRKG GMKQMGKALA GPMVLVMSIW DDHASNMLWL DSTFPVDAAG KPGAERGACP TTSGVPAEVE AEAPNSNVVF SNIRFGPIGS TVAGLPGAGN GGNNGGNPPP PTTTTSSAPA TTTTASAGPK AGRWQQCGGI GFTGPTQCEE PYICTKLNDW YSQCL 156712282 Chaetomium MMYKKFAALA ALVAGASAQQ ACSLTAENHP SLTWKRCTSG GSCSTVNGAV TIDANWRWTH TVSGSTNCYT thermophilum GNQWDTSLCT DGKSCAQTCC VDGADYSSTY GITTSGDSLN LKFVTKHQYG TNVGSRVYLM ENDTKYQMFE LLGNEFTFDV DVSNLGCGLN GALYFVSMDA DGGMSKYSGN KAGAKYGTGY CDAQCPRDLK FINGEANVGN WTPSTNDANA GFGRYGSCCS EMDVWEANNM ATAFTPHPCT TVGQSRCEAD TCGGTYSSDR YAGVCDPDGC DFNAYRQGDK TFYGKGMTVD TNKKMTVVTQ FHKNSAGVLS EIKRFYVQDG KIIANAESKI PGNPGNSITQ EYCDAQKVAF SNTDDFNRKG GMAQMSKALA GPMVLVMSVW DDHYANMLWL DSTYPIDQAG APGAERGACP TTSGVPAEIE AQVPNSNVIF SNIRFGPIGS TVPGLDGSNP GNPTTTVVPP ASTSTSRPTS STSSPVSTPT GQPGGCTTQK WGQCGGIGYT GCTNCVAGTT CTQLNPWYSQ CL 169768818 Aspergillus oryzae MASLSLSKIC RNALILSSVL STAQGQQVGT YQTETHPSMT WQTCGNGGSC STNQGSVVLD ANWRWVHQTG RIB40 SSSNCYTGNK WDTSYCSTND ACAQKCALDG ADYSNTYGIT TSGSEVRLNF VTSNSNGKNV GSRVYMMADD THYEVYKLLN QEFTFDVDVS KLPCGLNGAL YFVVMDADGG 
VSKYPNNKAG AKYGTGYCDS QCPRDLKFIQ GQANVEGWVS STNNANTGTG NHGSCCAELD IWESNSISQA LTPHPCDTPT NTLCTGDACG GTYSSDRYSG TCDPDGCDFN PYRVGNTTFY GPGKTIDTNK PITVVTQFIT DDGTSSGTLS EIKRFYVQDG VTYPQPSADV SGLSGNTINS EYCTAENTLF EGSGSFAKHG GLAGMGEAMS TGMVLVMSLW DDYYANMLWL DSNYPTNEST SKPGVARGTC STSSGVPSEV EASNPSAYVA YSNIKVGPIG STFKS 46241270 Gibberella pulicaris MYRAIATASA LIAAVRAQQV CSLTPETKPA LSWSKCTSSG CSNVQGSVTI DANWRWTHQL SGSTNCYTGN KWDTSICTSG KVCAEKCCID GAEYASTYGI TSSGNQLSLS FVTKGAYGTN IGSRTYLMED ENTYQMFQLL GNEFTFDVDV SNIGCGLNGA LYFVSMDADG GKAKYPGNKA GAKYGTGYCD AQCPRDVKFI NGQANSDGWQ PSKSDVNAGI GNMGTCCPEM DIWEANSIST AYTPHPCTKL TQHSCTGDSC GGTYSNDRYG GTCDADGCDF NAYRQGNKTF YGPGSGFNVD TTKKVTVVTQ FHKGSNGRLS EITRLYVQNG KVIANSESKI AGVPGSSLTP EFCTAQKKVF GDTDDFAKKG AWSGMSDALE APMVLVMSLW HDHHSNMLWL DSTYPTDSTK LGAQRGSCST SSGVPADLEK NVPNSKVAFS NIKFGPIGST YKEGVPEPTN PTNPTNPTNP TNPGTVDQWA QCGGTNYSGP TACKSPFTCK KINDFYSQCQ 49333363 Volvariella volvacea MFPKSSLLVL SFLATAYAQQ VGTQTAEVHP SLNWARCTSS GCTNVAGSVT LDANWRWLHT TSGYTNCYTG NSWNTTLCPD GATCAQNCAL DGANYQSTCG ITTSGNALTL KFVTQGEQKN IGSRVYLMAS ESRYEMFGLL NKEFTFDVDV SNLPCGLNGA LYFSSMDADG GMAKNPGNKA GAKYGTGYCD SQCPRDIKFI NGEANVAGWN GSPNDTNAGT GNWGACCNEM DIWEANSISA AYTPHPCTVQ GLSRCSGTAC GTNDRYGTVC DPDGCDFNSY RMGDKTYYGP GGTGVDTRSK FTVVTQFLTN NNSSSGTLSE IRRLYVQNGR VVQNSKVNIP GMSNTLDSIT TGFCDSQKTA FGDTRSFQNK GGMSAMGQAL GAGMVLVLSV WDDHAANMLW LDSNYPVDAD PSKPGIARGT CSTTSGKPTD VEQSAANSSV TFSNIKFGDI GTTYTGGSVT TTPGNPGTTT STAPGAVQTK WGQCGGQGWT GPTRCESGST CTVVNQWYSQ CI 46395332 Irpex lacteus MFRKAALLAF SFLAIAHGQQ VGTNQAENHP SLPSQHCTAS GCTTSSTSVV LDANWRWVHT TTGYTNCYTG QTWDASICPD GVTCAKACAL DGADYSGTYG ITTSGNALTL QFVKGTNVGS RVYLLQDASN YQLFKLINQE FTFDVDMSNL PCGLNGAVYL SQMDQDGGVS RFPTNTAGAK YGTGYCDSQC PRDIKFINGE ANVAGWTGSS SDPNSGTGNY GTCCSEMDIW EANSVAAAYT PHPCSVNQQT RCTGADCGQD ANRYKGVCDP DGCDFNSFRM GDQTFLGKGL TVDTSRKFTI VTQFISDDGT SSGNLAEIRR FYVQDGKVIP NSKVNIAGCD AVNSITDKFC TQQKTAFGDT NRFADQGGLK QMGAALKSGM VLALSLWDDH AANMLWLDSD YPTTADASKP GVARGTCPNT SGVPKDVESQ 
SGSATVTYSN IKWGDLNSTF SGTASNPTGP SSSPSGPSSS SSSTAGSQPT QPSSGSVAQW GQCGGIGYSG ATGCVSPYTC HVVNPYYSQC Y 50844407 # Chaetomium TETHPRLTWK RCTSGGNCST VNGAVTIDAN WRWTHTVSGS TNCYTGNEWD TSICSDGKSC AQTCCVDGAD thermophilum var YSSTYGITTS GDSLNLKFVT KHQHGTNVGS RVYLMENDTK YQMFELLGNE FTFDVDVSNL GCGLNGALYF thermophilum VSMDADGGMS KYSGNKAGAK YGTGYCDAQC PRDLKFINGE ANIENWTPST NDANAGFGRY GSCCSEMDIW EANNMATAFT PHPCTIIGQS RCEGNSCGGT YSSERYAGVC DPDGCDFNAY RQGDKTFYGK GMTVDTTKKM TVVTQFHKNS AGVLSEIKRF YVQDGKIIAN AESKIPGNPG NSITQEWCDA QKVAFGDIDD FNRKGGMAQM SKALEGPMVL VMSVWDDHYA NMLWLDSTYP IDKAGTPGAE RGACPTTSGV PAEIEAQVPN SNVIFSNIRF GPIGSTVPGL DGSTPSNPTA TVAPPTSTTT SVRSSTTQIS TPTSQPGGCT TQKWGQCGGI GYTGCTNCVA GTTCTELNPW YSQCL 4586347 Irpex lacteus MFHKAVLVAF SLVTIVHGQQ AGTQTAENHP QLSSQKCTAG GSCTSASTSV VLDSNWRWVH TTSGYTNCYT GNTWDASICS DPVSCAQNCA LDGADYAGTY GITTSGDALT LKFVTGSNVG SRVYLMEDET NYQMFKLMNQ EFTFDVDVSN LPCGLNGAVY FVQMDQDGGT SKFPNNKAGA KFGTGYCDSQ CPQDIKFING EANIVDWTAS AGDANSGTGS FGTCCQEMDI WEANSISAAY TPHPCTVTEQ TRCSGSDCGQ GSDRFNGICD PDGCDFNSFR MGNTEFYGKG LTVDTSQKFT IVTQFISDDG TADGNLAEIR RFYVQNGKVI PNSVVQITGI DPVNSITEDF CTQQKTVFGD TNNFAAKGGL KQMGEAVKNG MVLALSLWDD YAAQMLWLDS DYPTTADPSQ PGVARGTCPT TSGVPSQVEG QEGSSSVIYS NIKFGDLNST FTGTLTNPSS PAGPPVTSSP SEPSQSTQPS QPAQPTQPAG TAAQWAQCGG MGFTGPTVCA SPFTCHVLNP YYSQCY 3980202 Phanerochaete MFRAAALLAF TCLAMVSGQQ AGTNTAENHP QLQSQQCTTS GGCKPLSTKV VLDSNWRWVH STSGYTNCYT chrysosporium GNEWNTSLCP DGKTCAANCA LDGADYSGTY GITSTGTALT LKFVTGSNVG SRVYLMADDT HYQLLKLLNQ EFTFDVDMSN LPCGLNGALY LSAMDADGGM SKYPGNKAGA KYGTGYCDSQ CPKDIKFING EANVGNWTET GSNTGTGSYG TCCSEMDIWE ANNDAAAFTP HPCTTTGQTR CSGDDCARNT GLCDHGDGCD FNSFRMGDKT FLGKGMTVDT SKPFTDVTQF LTNDNTSTGT LSEIRRIYIQ NGKVIQNSVA NIPGVDPVNS ITDNFCAQQK TAFGDTNWFA QKGGLKQMGE ALGNGMVLAL SIWDDHAANM LWLDSDYPTD KDPSAPGVAR GTCATTSGVP SDVESQVPNS QVVFSNIKFG DIGSTFSGTS SPNPPGGSTT SSPVTTSPTP PPTGPTVPQW GQCGGIGYSG STTCASPYTC HVLNPYYSQC Y 27125837 Melanocarpus MMMKQYLQYL AAALPLVGLA AGQRAGNETP ENHPPLTWQR CTAPGNCQTV NAEVVIDANW 
RWLHDDNMQN albomyces CYDGNQWTNA CSTATDCAEK CMIEGAGDYL GTYGASTSGD ALTLKFVTKH EYGTNVGSRF YLMNGPDKYQ MFNLMGNELA FDVDLSTVEC GINSALYFVA MEEDGGMASY PSNQAGARYG TGYCDAQCAR DLKFVGGKAN IEGWKSSTSD PNAGVGPYGS CCAEIDVWES NAYAFAFTPH ACTTNEYHVC ETTNCGGTYS EDRFAGKCDA NGCDYNPYRM GNPDFYGKGK TLDTSRKFTV VSRFEENKLS QYFIQDGRKI EIPPPTWEGM PNSSEITPEL CSTMFDVFND RNRFEEVGGF EQLNNALRVP MVLVMSIWDD HYANMLWLDS IYPPEKEGQP GAARGDCPTD SGVPAEVEAQ FPDAQVVWSN IRFGPIGSTY DF 171696102 Podospora anserina MYRSATFLTF ASLVLGQQVG TYTAERHPSM PIQVCTAPGQ CTRESTEVVL DANWRWTHIT NGYTNCYTGN EWNATACPDG ATCAKNCAVD GADYSGTYGI TTPSSGALRL QFVKKNDNGQ NVGSRVYLMA SSDKYKLFNL LNKEFTFDVD VSKLPCGLNG AVYFSEMLED GGLKSFSGNK AGAKYGTGYC DSQCPQDIKF INGEANVEGW GGADGNSGTG KYGICCAEMD IWEANSDATA YTPHVCSVNE QTRCEGVDCG AGSDRYNSIC DKDGCDFNSY RLGNREFYGP GKTVDTTRPF TIVTQFVTDD GTDSGNLKSI HRYYVQDGNV IPNSVTEVAG VDQTNFISEG FCEQQKSAFG DNNYFGQLGG MRAMGESLKK MVLVLSIWDD HAVNMNWLDS IFPNDADPEQ PGVARGRCDP ADGVPATIEA AHPDAYVIYS NIKFGAINST FTAN 3913802 Cochliobolus MYRTLAFASL SLYGAARAQQ VGTSTAENHP KLTWQTCTGT GGTNCSNKSG SVVLDSNWRW AHNVGGYTNC carbonum YTGNSWSTQY CPDGDSCTKN CAIDGADYSG TYGITTSNNA LSLKFVTKGS FSSNIGSRTY LMETDTKYQM FNLINKEFTF DVDVSKLPCG LNGALYFVEM AADGGIGKGN NKAGAKYGTG YCDSQCPHDI KFINGKANVE GWNPSDADPN GGAGKIGACC PEMDIWEANS ISTAYTPHPC RGVGLQECSD AASCGDGSNR YDGQCDKDGC DFNSYRMGVK DFYGPGATLD TTKKMTVITQ FLGSGSSLSE IKRFYVQNGK VYKNSQSAVA GVTGNSITES FCTAQKKAFG DTSSFAALGG LNEMGASLAR GHVLIMSLWG DHAVNMLWLD STYPTDADPS KPGAARGTCP TTSGKPEDVE KNSPDATVVF SNIKFGPIGS TFAQPA 50403723 Trichoderma viride MYQKLALISA FLATARAQSA CTLQAETHPP LTWQKCSSGG TCTQQTGSVV IDANWRWTHA TNSSTNCYDG NTWSSTLCPD NETCAKNCCL DGAAYASTYG VTTSADSLSI GFVTQSAQKN VGARLYLMAS DTTYQEFTLL GNEFSFDVDV SQLPCGLNGA LYFVSMDADG GVSKYPTNTA GAKYGTGYCD SQCPRDLKFI NGQANVEGWE PSSNNANTGI GGHGSCCSEM DIWEANSISE ALTPHPCTTV GQEICDGDSC GGTYSGDRYG GTCDPDGCDW NPYRLGNTSF YGPGSSFTLD TTKKLTVVTQ FETSGAINRY YVQNGVTFQQ PNAELGDYSG NSLDDDYCAA EEAEFGGSSF SDKGGLTQFK KATSGGMVLV MSLWDDYYAN MLWLDSTYPT NETSSTPGAV RGSCSTSSGV 
PAQLESNSPN AKVVYSNIKF GPIGSTGNSS GGNPPGGNPP GTTTTRRPAT STGSSPGPTQ THYGQCGGIG YSGPTVCASG STCQVLNPYY SQCL 3913798 Aspergillus aculeatus MVDSFSIYKT ALLLSMLATS NAQQVGTYTA ETHPSLTWQT CSGSGSCTTT SGSVVIDANW RWVHEVGGYT NCYSGNTWDS SICSTDTTCA SECALEGATY ESTYGVTTSG SSLRLNFVTT ASQKNIGSRL YLLADDSTYE TFKLFNREFT FDVDVSNLPC GLNGALYFVS MDADGGVSRF PTNKAGAKYG TGYCDSQCPR DLKFIDGQAN IEGWEPSSTD VNAGTGNHGS CCPEMDIWEA NSISSAFTAH PCDSVQQTMC TGDTCGGTYS DTTDRYSGTC DPDGCDFNPY RFGNTNFYGP GKTVDNSKPF TVVTQFITHD GTDTGTLTEI RRLYVQNGVV IGNGPSTYTA ASGNSITESF CKAEKTLFGD TNVFETHGGL SAMGDALGDG MVLVLSLWDD HAADMLWLDS DYPTTSCASS PGVARGTCPT TTGNATYVEA NYPNSYVTYS NIKFGTLNST YSGTSSGGSS SSSTTLTTKA STSTTSSKTT TTTSKTSTTS SSSTNVAQLY GQCGGQGWTG PTTCASGTCTKQNDYYSQCL 66828465 Dictyostelium MYRILKSFIL LSLVNMSLSQ KIGKLTPEVH PPMTFQKCSE GGSCETIQGE VVVDANWRWV HSAQGQNCYT discoideum GNTWNPTICP DDETCAENCY LDGANYESVY GVTTSEDSVR LNFVTQSQGK NIGSRLFLMS NESNYQLFHV LGQEFTFDVD VSNLDCGLNG ALYLVSMDSD GGSARFPTNE AGAKYGTGYC DAQCPRDLKF ISGSANVDGW IPSTNNPNTG YGNLGSCCAE MDLWEANNMA TAVTPHPCDT SSQSVCKSDS CGGAASSNRY GGICDPDGCD YNPYRMGNTS FFGPNKMIDT NSVITVVTQF ITDDGSSDGK LTSIKRLYVQ DGNVISQSVS TIDGVEGNEV NEEFCTNQKK VFGDEDSFTK HGGLAKMGEA LKDGMVLVLS LWDDYQANML WLDSSYPTTS SPTDPGVARG SCPTTSGVPS KVEQNYPNAY VVYSNIKVGP IDSTYKK 156060391 Sclerotinia MISRVLAISS LLAAARAQQI GTNTAEVHPA LTSIVIDANW RWLHTTSGYT NCYTGNSWDA TLCPDAVTCA sclerotiorum 1980 ANCALDGADY SGTYGITTSG NSLKLNFVTK GANTNVGSRT YLMAAGSKTQ YQLLKLLGQE FTFDVDVSNL PCGLNGALYF AEMDADGGVS RFPTNKAGAQ YGTGYCDAQC PQDIKFINGQ ANSVGWTPSS NDVNTGTGQY GSCCSEMDIW EANKISAAYT PHPCSVDGQT RCTGTDCGIG ARYSSLCDAD GCDFNSYRMG DTGFYGAGLT VDTSKVFTVV TQFITNDGTT SGTLSEIRRF YVQNGKVIPN SQSKVTGVSG NSITDSFCAA QKTAFGDTNE FATKGGLATM SKALAKGMVL VMSIWDDHSA NMLWLDAPYP ASKSPSAAGV SRGSCSASSG VPADVEANSP GASVTYSNIK WGPINSTYSA GTGSNTGSGS GSTTTLVSSV PSSTPTSTTG VPKYGQCGGS GYTGPTNCIG STCVSMGQYY SQCQ 116181754 Chaetomium globosum MYRQVATALS FASLVLGQQV GTLTAETHPS LPIEVCTAPG SCTKEDTTVV LDANWRWTHV TDGYTNCYTG CBS 148-51 NAWNETACPD GKTCAANCAI 
DGAEYEKTYG ITTPEEGALR LNFVTESNVG SRVYLMAGED KYRLFNLLNK EFTMDVDVSN LPCGLNGAVY FSEMDEDGGM SRFEGNKAGA KYGTGYCDSQ CPRDIKFING EANSEGWGGE DGNSGTGKYG TCCAEMDIWE ANLDATAYTP HPCKVTEQTR CEDDTECGAG DARYEGLCDR DGCDFNSFRL GNKEFYGPEK TVDTSKPFTL VTQFVTADGT DTGALQSIRR FYVQDGTVIP NSETVVEGVD PTNEITDDFC AQQKTAFGDN NHFKTIGGLP AMGKSLEKMV LVLSIWDDHA VYMNWLDSNY PTDADPTKPG VARGRCDPEA GVPETVEAAH PDAYVIYSNI KIGALNSTFA AA 145230535 Aspergillus niger MSSFQVYRAA LLLSILATAN AQQVGTYTTE THPSLTWQTC TSDGSCTTND GEVVIDANWR WVHSTSSATN CYTGNEWDTS ICTDDVTCAA NCALDGATYE ATYGVTTSGS ELRLNFVTQG SSKNIGSRLY LMSDDSNYEL FKLLGQEFTF DVDVSNLPCG LNGALYFVAM DADGGTSEYS GNKAGAKYGT GYCDSQCPRD LKFINGEANC DGWEPSSNNV NTGVGDHGSC CAEMDVWEAN SISNAFTAHP CDSVSQTMCD GDSCGGTYSA SGDRYSGTCD PDGCDYNPYR LGNTDFYGPG LTVDTNSPFT VVTQFITDDG TSSGTLTEIK RLYVQNGEVI ANGASTYSSV NGSSITSAFC ESEKTLFGDE NVFDKHGGLE GMGEAMAKGM VLVLSLWDDY AADMLWLDSD YPVNSSASTP GVARGTCSTD SGVPATVEAE SPNAYVTYSN IKFGPIGSTY SSGSSSGSGS SSSSSSTTTK ATSTTLKTTS TTSSGSSSTS AAQAYGQCGG QGWTGPTTCV SGYTCTYENA YYSQCL 46241266 Nectria haematococca MYRAIATASA LLATARAQQV CTLNTENKPA LTWAKCTSSG CSNVRGSVVV DANWRWAHST SSSTNCYTGN mpVI TWDKTLCPDG KTCADKCCLD GADYSGTYGV TSSGNQLNLK FVTVGPYSTN VGSRLYLMED ENNYQMFDLL GNEFTFDVDV NNIGCGLNGA LYFVSMDKDG GKSRFSTNKA GAKYGTGYCD AQCPRDVKFI NGVANSDEWK PSDSDKNAGV GKYGTCCPEM DIWEANKIST AYTPHPCKSL TQQSCEGDAC GGTYSATRYA GTCDPDGCDF NPYRQGNKTF YGPGSGFNVD TTKKVTVVTQ FIKGSDGKLS EIKRLYVQNG KVIGNPQSEI ANNPGSSVTD SFCKAQKVAF NDPDDFNKKG GWSGMSDALA KPMVLVMSLW HDHYANMLWL DSTYPKGSKT PGSARGSCPE DSGDPDTLEK EVPNSGVSFS NIKFGPIGST YTGTGGSNPD PEEPEEPEEP VGTVPQYGQC GGINYSGPTA CVSPYKCNKI NDFYSQCQ 1q9h (PDB) # Talaromyces emersonii EQAGTATAEN HPPLTWQECT APGSCTTQNG AVVLDANWRW VHDVNGYTNC YTGNTWDPTY CPDDETCAQN CALDGADYEG TYGVTSSGSS LKLNFVTGSN VGSRLYLLQD DSTYQIFKLL NREFSFDVDV SNLPCGLNGA LYFVAMDADG GVSKYPNNKA GAKYGTGYCD SQCPRDLKFI DGEANVEGWQ PSSNNANTGI GDHGSCCAEM DVWEANSISN AVTPHPCDTP GQTMCSGDDC GGTYSNDRYA GTCDPDGCDF NPYRMGNTSF YGPGKIIDTT KPFTVVTQFL TDDGTDTGTL SEIKRFYIQN SNVIPQPNSD 
ISGVTGNSIT TEFCTAQKQA FGDTDDFSQH GGLAKMGAAM QQGMVLVMSL WDDYAAQMLW LDSDYPTDAD PTTPGIARGT CPTDSGVPSD VESQSPNSYV TYSNIKFGPI NSTFTAS 157362170 Polyporus arcularius MFPTLALVSL SFLAIAYGQQ VGTLTAETHP KLSVSQCTAG GSCTTVQRSV VLDSNWRWLH DVGGSTNCYT GNTWDDSLCP DPTTCAANCA LDGADYSGTY GITTSGNALS LKFVTQGPYS TNIGSRVYLL SEDDSTYEMF NLKNQEFTFD VDMSALPCGL NGALYFVEMD KDGGSGRFPT NKAGSKYGTG YCDTQCPHDI KFINGEANVL DWAGSSNDPN AGTGHYGTCC NEMDIWEANS MGAAVTPHVC TVQGQTRCEG TDCGDGDERY DGICDKDGCD FNSWRMGDQT FLGPGKTVDT SSKFTVVTQF ITADNTTSGD LSEIRRLYVQ NGKVIANSKT QIAGMDAYDS ITDDFCNAQK TTFGDTNTFE QMGGLATMGD AFETGMVLVM SIWDDHEAKM LWLDSDYPTD ADASAPGVSR GPCPTTSGDP TDVESQSPGA TVIFSNIKTG PIGSTFTS 7804885 Leptosphaeria MLSASKAAAI LAFCAHTASA WVVGDQQTET HPKLNWQRCT GKGRSSCTNV NGEVVIDANW RWLAHRSGYT maculans NCYTGSEWNQ SACPNNEACT KNCAIEGSDY AGTYGITTSG NQMNIKFITK RPYSTNIGAR TYLMKDEQNY EMFQLIGNEF TFDVDLSQRC GMNGALYFVS MPQKGQGAPG AKYGTGYCDA QCARDLKFVR GSANAEGWTK SASDPNSGVG KKGACCAQMD VWEANSAATA LTPHSCQPAG YSVCEDTNCG GTYSEDRYAG TCDANGCDFN PFRVGVKDFY GKGKTVDTTK KMTVVTQFVG SGNQLSEIKR FYVQDGKVIA NPEPTIPGME WCNTQKKVFQ EEAYPFNEFG GMASMSEGMS QGMVLVMSLW DDHYANMLWL DSNWPREADP AKPGVARRDC PTSGGKPSEV EAANPNAQVM FSNIKFGPIG STFAHAA 121852 Phanerochaete MFRTATLLAF TMAAMVFGQQ VGTNTARSHP ALTSQKCTKS GGCSNLNTKI VLDANWRWLH STSGYTNCYT chrysosporium GNQWDATLCP DGKTCAANCA LDGADYTGTY GITASGSSLK LQFVTGSNVG SRVYLMADDT HYQMFQLLNQ EFTFDVDMSN LPCGLNGALY LSAMDADGGM AKYPTNKAGA KYGTGYCDSQ CPRDIKFING EANVEGWNAT SANAGTGNYG TCCTEMDIWE ANNDAAAYTP HPCTTNAQTR CSGSDCTRDT GLCDADGCDF NSFRMGDQTF LGKGLTVDTS KPFTVVTQFI TNDGTSAGTL TEIRRLYVQN GKVIQNSSVK IPGIDPVNSI TDNFCSQQKT AFGDTNYFAQ HGGLKQVGEA LRTGMVLALS IWDDYAANML WLDSNYPTNK DPSTPGVARG TCATTSGVPA QIEAQSPNAY VVFSNIKFGD LNTTYTGTVS SSSVSSSHSS TSTSSSHSSS STPPTQPTGV TVPQWGQCGG IGYTGSTTCA SPYTCHVLNP YYSQCY 126013214 Penicillium decumbens MYQRALLFSA LMAGVSAQQV GTQKPETHPP LAWKECTSSG CTSKDGSVVI DANWRWVHSV DGYKNCYTGN EWDSTLCPDD ATCATNCAVD GADYAGTYGA TTEGDSLSIN FVTGSNIGSR FYLMEDENKY QMFKLLNKEF TFDVDVSTLP CGLNGALYFV 
SMDADGGMSK YETNKAGAKY GTGYCDSQCP RDLKFINGKG NVEGWKPSAN DKNAGVGPHG SCCAEMDIWE ANSISTALTP HPCDTNGQTI CEGDSCGGTY STTRYAGTCD PDGCDFNPFR MGNESFYGPG KMVDTKSKMT VVTQFITSDG TDTGSLKEIK RVYVQNGKVI ANSASDVSGI TGNSITSDFC TAQKKTFGDE DVFNKHGGLS GMGDALGEGM VLVMSLWDDH NSNMLWLDGE KYPTDAAASK AGVSRGTCST DSGKPSTVES ESGSAKVVFS NIKVGSIGST FSA 156048578 Sclerotinia MTSKIALASL FAAAYGQQIG TYTTETHPSL TWQSCTAKGS CTTQSGSIVL DGNWRWTHST TSSTNCYTGN sclerotiorum 1980 TWDATLCPDD ATCAQNCALD GADYSGTYGI TTSGDSLRLN FVTQTANKNV GSRVYLLADN THYKTFNLLN QEFTFDVDVS NLPCGLNGAV YFANLPADGG ISSTNKAGAQ YGTGYCDSQC PRDGKFINGK ANVDGWVPSS NNPNTGVGNY GSCCAEMDIW EANSISTAVT PHSCDTVTQT VCTGDNCGGT YSTTRYAGTC DPDGCDFNPY RQGNESFYGP GKTVDTNSVF TIVTQFLTTD GTSSGTLNEI KRFYVQNGKV IPNSESTISG VTGNSITTPF CTAQKTAFGD PTSFSDHGGL ASMSAAFEAG MVLVLSLWDD YYANMLWLDS TYPTTKTGAG GPRGTCSTSS GVPASVEASS PNAYVVYSNI KVGAINSTFG 156712278 Acremonium MYTKFAALAA LVATVRGQAA CSLTAETHPS LQWQKCTAPG SCTTVSGQVT IDANWRWLHQ TNSSTNCYTG thermophilum NEWDTSICSS DTDCATKCCL DGADYTGTYG VTASGNSLNL KFVTQGPYSK NIGSRMYLME SESKYQGFTL LGQEFTFDVD VSNLGCGLNG ALYFVSMDLD GGVSKYTTNK AGAKYGTGYC DSQCPRDLKF INGQANIDGW QPSSNDANAG LGNHGSCCSE MDIWEANKVS AAYTPHPCTT IGQTMCTGDD CGGTYSSDRY AGICDPDGCD FNSYRMGDTS FYGPGKTVDT GSKFTVVTQF LTGSDGNLSE IKRFYVQNGK VIPNSESKIA GVSGNSITTD FCTAQKTAFG DTNVFEERGG LAQMGKALAE PMVLVLSVWD DHAVNMLWLD STYPTDSTKP GAARGDCPIT SGVPADVESQ APNSNVIYSN IRFGPINSTY TGTPSGGNPP GGGTTTTTTT TTSKPSGPTT TTNPSGPQQT HWGQCGGQGW TGPTVCQSPY TCKYSNDWYS QCL 21449327 Aspergillus nidulans MYQRALLFSA LLSVSRAQQA GTAQEEVHPS LTWQRCEASG SCTEVAGSVV LDSNWRWTHS VDGYTNCYTG (also known as NEWDATLCPD NESCAQNCAV DGADYEATYG ITSNGDSLTL KFVTGSNVGS RVYLMEDDET YQMFDLLNNE Emericella nidulans) FTFDVDVSNF PCGLNGALYF TSMDADGGLS KYEGNTAGAK YGTGYCDSQC PRDIKFINGL GNVEGWEPSD SDANAGVGGM GTCCPEMDIW EANSISTAYT PHPCDSVEQT MCEGDSCGGT YSDDRYGGTC DPDGCDFNSY RMGNTRFYGP GAIIDTSSKF TVVTQFIADG GSLSEIKRFY VQNGEVIPNS ESNISGVEGN SITSEFCTAQ KTAFGDEDIF AQHGGLSAMG DAASAMVLIL SIWDDHHSSM MWLDSSYPTD ADPSQPGVAR GTCEQGAGDP 
DVVESEHADA SVTFSNIKFG PIGSTF 171683762 Podospora anserine (S MMMKQYLQYL AAGSLMTGLV AGQGVGTQQT ETHPRITWKR CTGKANCTTV QAEVVIDSNW RWIHTSGGTN mat+) CYDGNAWNTA ACSTATDCAS KCLMEGAGNY QQTYGASTSG DSLTLKFVTK HEYGTNVGSR FYLMNGASKY QMFTLMNNEF TFDVDLSTVE CGLNSALYFV AMEEDGGMRS YPTNKAGAKY GTGYCDAQCA RDLKFVGGKA NIEGWRESSN DENAGVGPYG GCCAEIDVWE SNAHAYAFTP HACENNNYHV CERDTCGGTY SEDRFAGGCD ANGCDYNPYR MGNPDFYGKG KTVDTTKKFT VVTRFQDDNL EQFFVQNGQK ILAPAPTFDG IPASPNLTPE FCSTQFDVFT DRNRFREVGD FPQLNAALRI PMVLVMSIWA DHYANMLWLD SVYPPEKEGE PGAARGPCAQ DSGVPSEVKA NYPNAKVVWS NIRFGPIGST VNV 56718412 Thermoascus MYQRALLFSF FLAAARAQQA GTVTAENHPS LTWQQCSSGG SCTTQNGKVV IDANWRWVHT TSGYTNCYTG aurantiacus var NTWDTSICPD DVTCAQNCAL DGADYSGTYG VTTSGNALRL NFVTQSSGKN IGSRLYLLQD DTTYQIFKLL levisporus GQEFTFDVDV SNLPCGLNGA LYFVAMDADG GLSKYPGNKA GAKYGTGYCD SQCPRDLKFI NGQANVEGWQ PSANDPNAGV GNHGSCCAEM DVWEANSIST AVTPHPCDTP GQTMCQGDDC GGTYSSTRYA GTCDPDGCDF NPYRQGNHSF YGPGKIVDTS SKFTVVTQFI TDDGTPSGTL TEIKRFYVQN GKVIPQSEST ISGVTGNSIT TEYCTAQKAA FGDNTGFFTH GGLQKISQAL AQGMVLVMSL WDDHAANMLW LDSTYPTDAD PDTPGVARGT CPTTSGVPAD VESQNPNSYV IYSNIKVGPI NSTFTAN 15824273 Pseudotrichonympha MFAIVLLGLT RSLGTGTNQA ENHPSLSWQN CRSGGSCTQT SGSVVLDSNW RWTHDSSLTN CYDGNEWSSS grassii LCPDPKTCSD NCLIDGADYS GTYGITSSGN SLKLVFVTNG PYSTNIGSRV YLLKDESHYQ IFDLKNKEFT FTVDDSNLDC GLNGALYFVS MDEDGGTSRF SSNKAGAKYG TGYCDAQCPH DIKFINGEAN VENWKPQTND ENAGNGRYGA CCTEMDIWEA NKYATAYTPH ICTVNGEYRC DGSECGDTDS GNRYGGVCDK DGCDFNSYRM GNTSFWGPGL IIDTGKPVTV VTQFVTKDGT DNGQLSEIRR KYVQGGKVIE NTVVNIAGMS SGNSITDDFC NEQKSAFGDT NDFEKKGGLS GLGKAFDYGM VLVLSLWDDH QVNMLWLDSI YPTDQPASQP GVKRGPCATS SGAPSDVESQ HPDSSVTFSD IRFGPIDSTY 115390801 Aspergillus terreus MHQRALLFSA LVGAVRAQQA GTLTEEVHPP LTWQKCTADG SCTEQSGSVV IDSNWRWLHS TNGSTNCYTG NIH2624 NTWDESLCPD NEACAANCAL DGADYESTYG ITTSGDALTL TFVTGENVGS RVYLMAEDDE SYQTFDLVGN EFTFDVDVSN LPCGLNGALY FTSMDADGGV SKYPANKAGA KYGTGYCDSQ CPRDLKFING MANVEGWTPS DNDKNAGVGG HGSCCPELDI WEANSISSAF TPHPCDDLGQ TMCSGDDCGG TYSETRYAGT CDPDGCDFNA 
YRMGNTSYYG PDKIVDTNSV MTVVTQFIGD GGSLSEIKRL YVQNGKVIAN AQSNVDGVTG NSITSDFCTA QKTAFGDQDI FSKHGGLSGM GDAMSAMVLI LSIWDDHNSS MMWLDSTYPE DADASEPGVA RGTCEHGVGD PETVESQHPG ATVTFSKIKF GPIGSTYSSN STA 453223 Phanerochaete MFRAAALLAF TCLAMVSGQQ AGTNTAENHP QLQSQQCTTS GGCKPLSTKV VLDSNWRWVH STSGYTNCYT chrysosporium GNEWDTSLCP DGKTCAANCA LDGADYSGTY GITSTGTALT LKFVTGSNVG SRVYLMADDT HYQLLKLLNQ EFTFDVDMSN LPCGLNGALY LSAMDADGGM SKYPGNKAGA KYGTGYCDSQ CPKDIKFING EANVGNWTET GSNTGTGSYG TCCSEMDIWE ANNDAAAFTP HPCTTTGQTR CSGDDCARNT GLCDGDGCDF NSFRMGDKTF LGKGMTVDTS KPFTVVTQFL TNDNTSTGTL SEIRRIYIQN GKVIQNSVAN IPGVDPVNSI TDNFCAQQKT AFGDTNWFAQ KGGLKQMGEA LGNGMVLALS IWDDHAANML WLDSDYPTDK DPSAPGVARG TCATTSGVPS DVESQVPNSQ VVFSNIKFGD IGSTFSGTSS PNPPGGSTTS SPVTTSPTPP PTGPTVPQWG QCGGIGYSGS TTCASPYTCH VLNPCESILS LQRSSNADQY LQTTRSATKR RLDTALQPRK 3132 Phanerochaete MRTALALILA LAAFSAVSAQ QAGTITAETH PTLTIQQCTQ SGGCAPLTTK VVLDVNWRWI HSTTGYTNCY chrysosporium SGNTWDAILC PDPVTCAANC ALDGADYTGT FGILPSGTSV TLRPVDGLGL RLFLLADDSH YQMFQLLNKE FTFDVEMPNM RCGSSGAIHL TAMDADGGLA KYPGNQAGAK YGTGFCSAQC PKGVKFINGQ ANVEGWLGTT ATTGTGFFGS CCTDIALWEA NDNSASFAPH PCTTNSQTRC SGSDCTADSG LCDADGCNFN SFRMGNTTFF GAGMSVDTTK LFTVVTQFIT SDNTSMGALV EIHRLYIQNG QVIQNSVVNI PGINPATSIT DDLCAQENAA FGGTSSFAQH GGLAQVGEAL RSGMVLALSI VNSAADTLWL DSNYPADADP SAPGVARGTC PQDSASIPEA PTPSVVFSNI KLGDIGTTFG AGSALFSGRS PPGPVPGSAP ASSATATAPP FGSQCGGLGY AGPTGVCPSP YTCQALNIYY SQCI 16304152 Thermoascus MYQRALLFSF FLAAARAHEA GTVTAENHPS LTWQQCSSGG SCTTQNGKVV IDANWRWVHT TSGYTNCYTG aurantiacus NTWDTSICPD DVTCAQNCAL DGADYSGTYG VTTSGNALRL NFVTQSSGKN IGSRLYLLQD DTTYQIFKLL GQEFTFDVDV SNLPCGLNGA LYFVAMDADG NLSKYPGNKA GAKYGTGYCD SQCPRDLKFI NGQANVEGWQ PSANDPNAGV GNHGSSCAEM DVWEANSIST AVTPHPCDTP GQTMCQGDDC GGTYSSTRYA GTCDTDGCDF NPYQPGNHSF YGPGKIVDTS SKFTVVTQFI TDDGTPSGTL TEIKRFYVQN GKVIPQSEST ISGVTGNSIT TEYCTAQKAA FDNTGFFTHG GLQKISQALA QGMVLVMSLW DDHAANMLWL DSTYPTDADP DTPGVARGTC PTTSGVPADV ESQNPNSYVI YSNIKVGPIN STFTAN 156712280 Acremonium MHKRAATLSA LVVAAAGFAR GQGVGTQQTE 
THPKLTFQKC SAAGSCTTQN GEVVIDANWR WVHDKNGYTN thermophilum CYTGNEWNTT ICADAASCAS NCVVDGADYQ GTYGASTSGN ALTLKFVTKG SYATNIGSRM YLMASPTKYA MFTLLGHEFA FDVDLSKLPC GLNGAVYFVS MDEDGGTSKY PSNKAGAKYG TGYCDSQCPR DLKFIDGKAN SASWQPSSND QNAGVGGMGS CCAEMDIWEA NSVSAAYTPH PCQNYQQHSC SGDDCGGTYS ATRFAGDCDP DGCDWNAYRM GVHDFYGNGK TVDTGKKFSI VTQFKGSGST LTEIKQFYVQ DGRKIENPNA TWPGLEPFNS ITPDFCKAQK QVFGDPDRFN DMGGFTNMAK ALANPMVLVL SLWDDHYSNM LWLDSTYPTD ADPSAPGKGR GTCDTSSGVP SDVESKNGDA TVIYSNIKFG PLDSTYTAS 5231154 Volvariella volvacea MRASLLAFSL NSAAGQQAGT LQTKNHPSLT SQKCRQGGCP QVNTTIVLDA NWRWTHSTSG STNCYTGNTW QATLCPDGKT CAANCALDGA DYTGTYGVTT SGNSLTLQFV TQSNVGARLG YLMADDTTYQ MFNLLNQEFW FDVDMSNLPC GLNGALYFSA MARTAAWMPM VVCASTPLIS TRRSTARLLR LPVPPRSRYG RGICDSQCPR DIKFINGEAN VQGWQPSPND TNAGTGNYGA CCNKMDVWEA NSISTAYTPH PCTQRGLVRC SGTACGGGSN RYGSICDHDG LGFQNLFGMG RTRVRARVGR VKQFNRSSRV VEPISWTKQT TLHLGNLPWK SADCNVQNGR VIQNSKVNIP GMPSTMDSVT TEFCNAQKTA FNDTFSFQQK GGMANMSEAL RRGMVLVLSI WDDHAANMLW LDSITSAAAC RSTPSEVHAT PLRESQIRSS HSRQTRYVTF TNIKFGPFNS TGTTYTTGSV PTTSTSTGTT GSSTPPQPTG VTVPQGQCGG IGYTGPTTCA SPTTCHVLNP YYSQCY 116200349 Chaetomium globosum MKQYLQYLAA ALPLMSLVSA QGVGTSTSET HPKITWKKCS SGGSCSTVNA EVVIDANWRW LHNADSKNCY CBS 148-51 DGNEWTDACT SSDDCTSKCV LEGAEYGKTY GASTSGDSLS LKFLTKHEYG TNIGSRFYLM NGASKYQMFT LMNNEFAFDV DLSTVECGLN SALYFVAMEE DGGMASYSTN KAGAKYGTGY CDAQCARDLK FVGGKANYDG WTPSSNDANA GVGALGGCCA EIDVWESNAH AFAFTPHACE NNNYHVCEDT TCGGTYSEDR FAGDCDANGC DYNPYRVGNT DFYGKGMTVD TSKKFTVVSQ FQENKLTQFF VQNGKKIEIP GPKHEGLPTE SSDITPELCS AMPEVFGDRD RFAEVGGFDA LNKALAVPMV LVMSIWDDHY ANMLWLDSSY PPEKAGTPGG DRGPCAQDSG VPSEVESQYP DATVVWSNIR FGPIGSTVQV 4586343 Irpex lacteus MFPKASLIAL SFIAAVYGQQ VGTQMAEVHP KLPSQLCTKS GCTNQNTAVV LDANWRWLHT TSGYTNCYTG NSWDATLCPD ATTCAQNCAV DGADYSGTYG ITTSGNALTL KFKTGTNVGS RVYLMQTDTA YQMFQLLNQE FTFDVDMSNL PCGLNGALYL SQMDQDGGLS KFPTNKAGAK YGTGYCDSQC PHDIKFINGM ANVAGWAGSA SDPNAGSGTL GTCCSEMDIW EANNDAAAFT PHPCSVDGQT QCSGTQCGDD DERYSGLCDK DGCDFNSFRM GDKSFLGKGM TVDTSRKFTV 
VTQFVTTDGT TNGDLHEIRR LYVQDGKVIQ NSVVSIPGID AVDSITDNFC AQQKSVFGDT NYFATLGGLK KMGAALKSGM VLAMSVWDDH AASMQWLDSN YPADGDATKP GVARGTCSAD SGLPTNVESQ SASASVTFSN IKWGDINTTF TGTGSTSPSS PAGPVSSSTS VASQPTQPAQ GTVAQWGQCG GTGFTGPTVC ASPFTCHVVNPYYSQCY 15321718 Lentinula edodes MFRTAALLSF AYLAVVYGQQ AGTSTAETHP PLTWEQCTSG GSCTTQSSSV VLDSNWRWTH VVGGYTNCYT GNEWNTTVCP DGTTCAANCA LDGADYEGTY GISTSGNALT LKFVTASAQT NVGSRVYLMA PGSETEYQMF NPLNQEFTFD VDVSALPCGL NGALYFSEMD ADGGLSEYPT NKAGAKYGTG YCDSQCPRDI KFIEGKANVE GWTPSSTSPN AGTGGTGICC NEMDIWEANS ISEALTPHPC TAQGGTACTG DSCSSPNSTA GICDQAGCDF NSFRMGDTSF YGPGLTVDTT SKITVVTQFI TSDNTTTGDL TAIRRIYVQN GQVIQNSMSN IAGVTPTNEI TTDFCDQQKT AFGDTNTFSE KGGLTGMGAA FSRGMVLVLS IWDDDAAEML WLDSTYPVGK TGPGAARGTC ATTSGQPDQV ETQSPNAQVV FSNIKFGAIG STFSSTGTGT GTGTGTGTGT GTTTSSAPAA TQTKYGQCGG QGWTGATVCA SGSTCTSSGP YYSQCL 146424875 Pleurotus sp Florida MFRTAALTAF TFAAVVLGQQ VGTLTTENHP ALSIQQCTAT GCTTQQKSVV LDSNWRWTHS TAGATNCYTG NAWDPALCPD PATCATNCAI DGADYSGTYG ITTSGNALTL RFVTNGQYSQ NIGSRVYLLD DADHYKLFDL KNQEFTFDVD MSGLPCGLNG ALYFSEMAAD GGKAAHAGNN AGAKYGTGYC DAQCPHDIKW INGEANVLDW SASATDDNAG NGRYGACCAE MDIWEANSEA TAYTPHVCRD EGLYRCSGTE CGDGNNRYGG VCDKDGCDFN SYRMGDKNFL GRGKTIDTTK KVTVVTQFIT DNNTPTGNLV EIRRVYVQNG VVYQNSFSTF PSLSQYNSIS DEFCVAQKTL FGDNQYYNTH GGTTKMGDAF DNGMVLIMSL WSDHAAHMLW LDSDYPLDKS PSEPGVSRGA CPTSSGDPDD VVANHPNASV TFSNIKYGPI GSTFGGSTPP VSSGGSSVPP VTSTTSSGTT TPTGPTGTVP KWGQCGGIGY SGPTACVAGS TCTYSNDWYS QCL 62006158 Fusarium venenatum MYRAIATASA LIAAVRAQQV CSLTPETKPA LSWSKCTSSG CSNVQGSVTI DANWRWTHQL SGSTNCYTGN KWDTSICTSG KVCAEKCCID GAEYASTYGI TSSGNQLSLS FVTKGTYGTN IGSRTYLMED ENTYQMFQLL GNEFTFDVDV SNIGCGLNGA LYFVSMDADG GKAKYPGNKA GAKYGTGYCD AQCPRDVKFI NGQANSDGWQ PSKSDVNGGI GNLGTCCPEM DIWEANSIST AHTPHPCTKL TQHSCTGDSC GGTYSEDRYG GTCDADGCDF NAYRQGNKTF YGPGSGFNVD TTKKVTVVTQ FHKGSNGRLS EITRLYVQNG KVIANSESKI AGVPGSSLTP EFCTAQKKVF GDIDDFEKKG AWGGMSDALE APMVLVMSLW HDHHSNMLWL DSTYPTDSTK LGAQRGSCST SSGVPADLEK NVPNSKVAFS NIKFGPIGST YKEGQPEPTN PTNPNPTTPG GTVDQWGQCG 
GTNYSGPTAC KSPFTCKKIN DFYSQCQ 296027 Phanerochaete MFRTATLLAF TMAAMVFGQQ VGTNTAENHR TLTSQKCTKS GGCSNLNTKI VLDANWRWLH STSGYTNCYT chrysosporium GNQWDATLCP DGKTCAANCA LDGADYTGTY GITASGSSLK LQFVTGSNVG SRVYLMADDT HYQMFQLLNQ EFTFDVDMSN LPCGLNGALY LSAMDADGGM AKYPTNKAGA KYGTGYCDSQ CPRDIKFING EANVEGWNAT SANAGTGNYG TCCTEMDIWE ANNDAAAYTP HPCTTNAQTR CSGSDCTRDT GLCDADGCDF NSFRMGDQTF LGKGLTVDTS KPFTVVTQFI TNDGTSAGTL TEIRRLYVQN GKVIQNSSVK IPGIDLVNSI TDNFCSQQKT AFGDTNYFAQ HGGLKQVGEA LRTGMVLALS IWDDYAANML WLDSNYPTNK DPSTPGVARG TCATTSGVPA QIEAQSPNAY VVFSNIKFGD LNTTYTGTVS SSSVSSSHSS TSTSSSHSSS STPPTQPTGV TVPQWGQCGG IGYTGSTTCA SPYTCHVLNP YYSQCY 154449709 Fusicoccum sp MYQTSLLASL SFLLATSQAQ QVGTQTAETH PKLTTQKCTT AGGCTDQSTS IVLDANWRWL HTVDGYTNCY BCC4124 TGQEWDTSIC TDGKTCAEKC ALDGADYEST YGISTSGNAL TMNFVTKSSQ TNIGGRVYLL AADSDDTYEL FKLKNQEFTF DVDVSNLPCG LNGALYFSEM DSDGGLSKYT TNKAGAKYGT GYCDTQCPHD IKFINGEANV QNWTASSTDK NAGTGHYGSC CNEMDIWEAN SQATAFTPHV CEAKVEGQYR CEGTECGDGD NRYGGVCDKD GCDFNSYRMG NETFYGSNGS TIDTTKKFTV VTQFITADNT ATGALTEIRR KYVQNDVVIE NSYADYETLS KFNSITDDFC AAQKTLSGDT NDFKTKGGIA RMGESFERGM VLVMSVWDDH AANALWLDSS YPTDADASKP GVKRGPCSTS SGVPSDVEAN DADSSVIYSN IRYGDIGSTF NKTA 169859460 Coprinopsis cinerea MFSKVALTAL CFLAVAQAQQ VGREVAENHP RLPWQRCTRN GGCQTVSNGQ VVLDANWRWL HVTDGYTNCY okayama TGNAWNSSVC SDGATCAQRC ALEGANYQQT YGITTSGDAL TIKFLTRSEQ TNIGARVYLM ENEDRYQMFN LLNKEFTFDV DVSKVPCGIN GALYFIQMDA DGGLSSQPNN RAGAKYGTGY CDSQCPRDIK FINGEANSVG WEPSETDPNA GKGQYGICCA EMDIWEANSI SNAYTPHPCQ TVNDGGYQRC QGRDCNQPRY EGLCDPDGCD YNPFRMGNKD FYGPGKTVDT NRKMTVVTQF ITHDNTDTGT LVDIRRLYVQ DGRVIANPPT NFPGLMPAHD SITQEFCDDA KRAFEDNDSF GRNGGLAHMG RSLAKGHVLA LSIWNDHTAH MLWLDSNYPT DADPNKPGIA RGTCPTTGGS PRDTEQNHPD AQVIFSNIKF GDIGSTFSGN 50400675 Trichoderma MYRKLAVISA FLAAARAQQV CTQQAETHPP LTWQKCTASG CTPQQGSVVL DANWRWTHDT KSTTNCYDGN harzianum (anamorph TWSSTLCPDD ATCAKNCCLD GANYSGTYGV TTSGDALTLQ FVTASNVGSR LYLMANDSTY QEFTLSGNEF of Hypocrea lixii) SFDVDVSQLP CGLNGALYFV SMDADGGQSK YPGNAAGAKY GTGYCDSQCP RDLKFINGQA 
NVEGWEPSSN NANTGVGGHG SCCSEMDIWE ANSISEALTP HPCETVGQTM CSGDSCGGTY SNDRYGGTCD PDGCDWNPYR LGNTSFYGPG SSFALDTTKK LTVVTQFATD GSISRYYVQN GVKFQQPNAQ VGSYSGNTIN TDYCAAEQTA FGGTSFTDKG GLAQINKAFQ GGMVLVMSLW DDYAVNMLWL DSTYPTNATA STPGAKRGSC STSSGVPAQV EAQSPNSKVI YSNIRFGPIG STGGNTGSNP PGTSTTRAPP SSTGSSPTAT QTHYGQCGGT GWTGPTRCAS GYTCQVLNPF YSQCL 729649 Neurospora crassa MRASLLAFSL AAAVAGGQQA GTLTAKRHPS LTWQKCTRGG CPTLNTTMVL DANWRWTHAT SGSTKCYTGN (OR74A) KWQATLCPDG KSCAANCALD GADYTGTYGI TGSGWSLTLQ FVTDNVGARA YLMADDTQYQ MLELLNQELW FDVDMSNIPC GLNGALYLSA MDADGGMRKY PTNKAGAKYA TGYCDAQCPR DLKYINGIAN VEGWTPSTND ANGIGDHGSC CSEMDIWEAN KVSTAFTPHP CTTIEQHMCE GDSCGGTYSD DRYGVLCDAD GCDFNSYRMG NTTFYGEGKT VDTSSKFTVV TQFIKDSAGD LAEIKAFYVQ NGKVIENSQS NVDGVSGNSI TQSFCKSQKT AFGDIDDFNK KGGLKQMGKA LAQAMVLVMS IWDDHAANML WLDSTYPVPK VPGAYRGSGP TTSGVPAEVD ANAPNSKVAF SNIKFGHLGI SPFSGGSSGT PPSNPSSSAS PTSSTAKPSS TSTASNPSGT GAAHWAQCGG IGFSGPTTCP EPYTCAKDHD IYSQCV 119472134 Neosartorya fischeri MLASTFSYRM YKTALILAAL LGSGQAQQVG TSQAEVHPSM TWQSCTAGGS CTTNNGKVVI DANWRWVHKV NRRL 181 GDYTNCYTGN TWDKTLCPDD ATCASNCALE GANYQSTYGA TTSGDSLRLN FVTTSQQKNI GSRLYMMKDD TTYEMFKLLN QEFTFDVDVS NLPCGLNGAL YFVAMDADGG MSKYPTNKAG AKYGTGYCDS QCPRDLKFIN GQANVEGWQP SSNDANAGTG NHGSCCAEMD IWEANSISTA FTPHPCDTPG QVMCTGDACG GTYSSDRYGG TCDPDGCDFN SFRQGNKTFY GPGMTVDTKS KFTVVTQFIT DDGTASGTLK EIKRFYVQNG KVIPNSESTW SGVGGNSITN DYCTAQKSLF KDQNVFAKHG GMEGMGAALA QGMVLVMSLW DDHAANMLWL DSNYPTTASS STPGVARGTC DISSGVPADV EANHPDASVV YSNIKVGPIG STFNSGGSNP GGGTTTTAKP TTTTTTAGSP GGTGVAQHYG QCGGNGWQGP TTCASPYTCQ KLNDFYSQCL 117935080 Chaetomium MQIKQYLQYL AAALPLVNMA AAQRAGTQQT ETHPRLSWKR CSSGGNCQTV NAEIVIDANW thermophilum RWLHDSNYQN CYDGNRWTSA CSSATDCAQK CYLEGANYGS TYGVSTSGDA LTLKFVTKHE YGTNIGSRVY LMNGSDKYQM FTLMNNEFAF DVDLSKVECG LNSALYFVAM EEDGGMRSYS SNKAGAKYGT GYCDAQCARD LKFVGGKANI EGWRPSTNDA NAGVGPYGAC CAEIDVWESN AYAFAFTPHG CLNNNYHVCE TSNCGGTYSE DRFGGLCDAN GCDYNPYRMG NKDFYGKGKT VDTSRKFTVV TRFEENKLTQ FFIQDGRKID IPPPTWPGLP NSSAITPELC TNLSKVFDDR 
DRYEETGGFR TINEALRIPM VLVMSIWDGH YASMLWLDSV YPPEKAGQPG AERGPCAPTS GVPAEVEAQF PNAQVIWSNI RFGPIGSTYQ V 154300584 Botryotinia fuckeliana MTSRIALVSL FAAVYGQQVG TYQTETHPSL TWQSCTAKGS CTTNTGSIVL DGNWRWTHGV B05-10 GTSTNCYTGN TWDATLCPDD ATCAQNCALE GADYSGTYGI TTSGNSLRLN FVTQSANKNI GSRVYLMADT THYKTFNLLN QEFTFDVDVS NLPCGLNGAV YFANLPADGG ISSTNTAGAE YGTGYCDSQC PRDMKFIKGQ ANVDGWVPSS NNANTGVGNH GSCCAEMDIW EANSISTAVT PHSCDTVTQT VCTGDDCGGT YSSSRYAGTC DPDGCDFNSY RMGDETFYGP GKTVDTNSVF TVVTQFLTTD GTASGTLNEI KRFYVQDGKV IPNSYSTISG VSGNSITTPF CDAQKTAFGD PTSFSDHGGL ASMSAAFEAG MVLVLSLWDD YYANMLWLDS TYPVGKTSAG GPRGTCDTSS GVPASVEASS PNAYVVYSNI KVGAINSTYG 15824271 Pseudotrichonympha MFVFVLLWLT QSLGTGTNQA ENHPSLSWQN CRSGGSCTQT SGSVVLDSNW RWTHDSSLTN grassii CYDGNEWSSS LCPDPKTCSD NCLIDGADYS GTYGITSSGN SLKLVFVTNG PYSTNIGSRV YLLKDESHYQ IFDLKNKEFT FTVDDSNLDC GLNGALYFVS MDEDGGTSRF SSNKAGAKYG TGYCDAQCPH DIKFINGEAN VENWKPQTND ENAGNGRYGA CCTEMDIWEA NKYATAYTPH ICTVNGEYRC DGSECGDTDS GNRYGGVCDK DGCDFNSYRM GNTSFWGPGL IIDTGKPVTV VTQFVTKDGT DNGQLSEIRR KYVQGGKVIE NTVVNIAGMS SGNSITDDFC NEQKSAFGDT NDFEKKGGLS GLGKAFDYGM VLVLSLWDDH QVNMLWLDSI YPTDQPASQP GVKRGPCATS SGAPSDVESQ HPDSSVTFSD IRFGPIDSTY 4586345 Irpex lacteus MFRKAALLAF SFLAIAHGQQ VGTNQAENHP SLPSQKCTAS GCTTSSTSVV LDANWRWVHT TTGYTNCYTG QTWDASICPD GVTCAKACAL DGADYSGTYG ITTSGNALTL QFVKGTNVGS RVYLLQDASN YQMFQLINQE FTFDVDMSNL PCGLNGAVYL SQMDQDGGVS RFPTNTAGAK YGTGYCDSQC PRDIKFINGE ANVEGWTGSS TDSNSGTGNY GTCCSEMDIW EANSVAAAYT PHPCSVNQQT RCTGADCGQG DDRYDGVCDP DGCDFNSFRM GDQTFLGKGL TVDTSRKFTI VTQFISDDGT TSGNLAEIRR FYVQDGNVIP NSKVSIAGID AVNSITDDFC TQQKTAFGDT NRFAAQGGLK QMGAALKSGM VLALSLWDDH AANMLWLDSD YPTTADASNP GVARGTCPTT SGFPRDVESQ SGSATVTYSN IKWGDLNSTF TGTLTTPSGS SSPSSPASTS GSSTSASSSA SVPTQSGTVA QWAQCGGIGY SGATTCVSPY TCHVVNAYYS QCY 46241268 Gibberella avenacea MYRAIATASA LIAAARAQQV CTLTTETKPA LTWSKCTSSG CTDVKGSVGI DANWRWTHQT SSSTNCYTGN KWDTSVCTSG ETCAQKCCLD GADYAGTYGI TSSGNQLSLG FVTKGSFSTN IGSRTYLMEN ENTYQMFQLL GNEFTFDVDV SNIGCGLNGA LYFVSMDADG GKARYPANKA 
GAKYGTGYCD AQCPRDVKFI NGKANSDGWK PSDSDINAGI GNMGTCCPEM DIWEANSIST AFTPHPCTKL TQHACTGDSC GGTYSNDRYG GTCDADGCDF NSYRQGNKTF YGRGSDFNVD TTKKVTVVTQ FKKGSNGRLS EITRLYVQNG KVIANSESKI PGNSGSSLTA DFCSKQKSVF GDIDDFSKKG GWSGMSDALE SPPMVLVMSL WHDHHSNMLW LDSTYPTDST KLGAQRGSCA TTSGVPSDLE RDVPNSKVSF SNIKFGPIGS TYSSGTTNPP PSSTDTSTTP TNPPTGGTVG QYGQCGGQTY TGPKDCKSPY TCKKINDFYS QCQ 6164684 Aspergillus niger MSSFQIYRAA LLLSILATAN AQQVGTYTTE THPSLTWQTC TSDGSCTTND GEVVIDANWR WVHSTSSATN CYTGNEWDTS ICTDDVTCAA NCALDGATYE ATYGVTTSGS ELRLNFVTQG SSKNIGSRLY LMSDDSNYEL FKLLGQEFTF DVDVSNLPCG LNGALYFVAM DADGGTSEYS GNKAGAKYGT GYCDSQCPRD LKFINGEANC DGWEPSSNNV NTGVGDHGSC CAEMDVWEAN SISNAFTAHP CDSVSQTMCD GDSCGGTYSA SGDRYSGTCD PDGCDYNPYR LGNTDFYGPG LTVDTNSPFT VVTQFITDDG TSSGTLTEIK RLYVQNGEVI ANGASTYSSV NGSSITSAFC ESEKTLFGDE NVFDKHGGLE GMGEAMAKGM VLVLSLWDDY AADMLWLDSD YPVNSSASTP GVARGTCSTD SGVPATVEAE SPNAYVTYSN IKFGPIGSTY SSGSSSGSGS SSSSSSTTTK ATSTTLKTTS TTSSGSSSTS AAQAYGQCGG QGWTGPTTCV SGYTCTYENA YYSQCL 6164682 Aspergillus niger MHQRALLFSA LLTAVRAQQA GTLTEEVHPS LTWQKCTSEG SCTEQSGSVV IDSNWRWTHS VNDSTNCYTG NTWDATLCPD DETCAANCAL DGADYESTYG VTTDGDSLTL KFVTGSNVGS RLYLMDTSDE GYQTFNLLDA EFTFDVDVSN LPCGLNGALY FTAMDADGGV SKYPANKAGA KYGTGYCDSQ CPRDLKFIDG QANVDGWEPS SNNDNTGIGN HGSCCPEMDI WEANKISTAL TPHPCDSSEQ TMCEGNDCGG TYSDDRYGGT CDPDGCDFNP YRMGNDSFYG PGKTIDTGSK MTVVTQFITD GSGSLSEIKR YYVQNGNVIA NADSNISGVT GNSITTDFCT AQKKAFGDED IFAEHNGLAG ISDAMSSMVL ILSLWDDYYA SMEWLDSDYP ENATATDPGV ARGTCDSESG VPATVEGAHP DSSVTFSNIK FGPINSTFSA SA 33733371 Chrysosporium MYAKFATLAA LVAGAAAQNA CTLTAENHPS LTWSKCTSGG SCTSVQGSIT IDANWRWTHR TDSATNCYEG lucknowense NKWDTSYCSD GPSCASKCCI DGADYSSTYG ITTSGNSLNL KFVTKGQYST NIGSRTYLME SDTKYQMFQL U.S. Pat. No. 
6,573,086-10 LGNEFTFDVD VSNLGCGLNG ALYFVSMDAD GGMSKYSGNK AGAKYGTGYC DSQCPRDLKF INGEANVENW QSSTNDANAG TGKYGSCCSE MDVWEANNMA AAFTPHPCXV IGQSRCEGDS CGGTYSTDRY AGICDPDGCD FNSYRQGNKT FYGKGMTVDT TKKITVVTQF LKNSAGELSE IKRFYVQNGK VIPNSESTIP GVEGNSITQD WCDRQKAAFG DVTDXQDKGG MVQMGKALAG PMVLVMSIWD DHAVNMLWLD STWPIDGAGK PGAERGACPT TSGVPAEVEA EAPNSNVIFS NIRFGPIGST VSGLPDGGSG NPNPPVSSST PVPSSSTTSS GSSGPTGGTG VAKHYEQCGG IGFTGPTQCE SPYTCTKLND WYSQCL 29160311 Thielavia australiensis MYAKFATLAA LVAGASAQAV CSLTAETHPS LTWQKCTAPG SCTNVAGSIT IDANWRWTHQ TSSATNCYSG SKWDSSICTT GTDCASKCCI DGAEYSSTYG ITTSGNALNL KFVTKGQYST NIGSRTYLME SDTKYQMFKL LGNEFTFDVD VSNLGCGLNG ALYFVSMDAD GGMSKYSGNK AGAKYGTGYC DAQCPRDLKF INGEANVEGW ESSTNDANAG SGKYGSCCTE MDVWEANNMA TAFTPHPCTT IGQTRCEGDT CGGTYSSDRY AGVCDPDGCD FNSYRQGNKT FYGKGMTVDT TKKITVVTQF LKNSAGELSE IKRFYAQDGK VIPNSESTIA GIPGNSITKA YCDAQKTVFQ NTDDFTAKGG LVQMGKALAG DMVLVMSVWD DHAVNMLWLD STYPTDQVGV AGAERGACPT TSGVPSDVEA NAPNSNVIFS NIRFGPIGST VQGLPSSGGT SSSSSAAPQS TSTKASTTTS AVRTTSTATT KTTSSAPAQG TNTAKHWQQC GGNGWTGPTV CESPYKCTKQ NDWYSQCL 146197087 uncultured symbiotic MLTLVYFLLS LVVSLEIGTQ QSEDHPKLTW QNGSSSVSGS IVLDSNWRWV HDSGTTNCYD GNLWSKDLCP protist of SSDTCSQKCY IEGADYSGTY GIQSSGSKLT LKFVTKGSYS TNIGSRVYLL KDENTYESFK LKNKEFTFTV Reticulitermes DDSKLNCGLN GALYFVAMDA DGGKAKYSSF KPGAKYGMGY CDAQCPHDMK FISGKANVDD WKPQDNDENS speratus GNGKLGTCCS EMDIWEGNMK SQAYTVHACT KSGQYECTGQ QCGDTDSGDR FKGTCDKDGC DYASWRWGDQ SFYGEGKTVD TKQPVTVVTQ FIGDPLTEIR RLYVQGGKTI NNSKTSNLAD TYDSITDKFC DATKEASGDT NDFKAKGAMS GFSTNLNNGQ VLVMSLWDDH TANMLWLDST YPTDSSDSTA QRGPCPTSSG VPKDVESQHG DATVVFSDIK FGAINSTFKY N 146197237 uncultured symbiotic MLAAALFTFA CSVGVGTKTP ENHPKLNWQN CASKGSCSQV SGEVTMDSNW RWTHDGNGKN CYDGNTWISS protist of Neotermes LCPDDKTCSD KCVLDGAEYQ ATYGIQSNGT ALTLKFVTHG SYSTNIGSRL YLLKDKSTYY VFKLNNKEFT koshunensis FSVDVSKLPC GLNGALYFVE MDADGGKAKY AGAKPGAEYG LGYCDAQCPS DLKFINGEAN SEGWKPQSGD KNAGNGKYGS CCSEMDVWES NSQATALTPH VCKTTGQQRC SGKSECGGQD GQDRFAGLCD EDGCDFNNWR MGDKTFFGPG 
LIVDTKSPFV VVTQFYGSPV TEIRRKYVQN GKVIENSKSN IPGIDATAAI SDHFCEQQKK AFGDTNDFKN KGGFAKLGQV FDRGMVLVLS LWDDHQVAML WLDSTYPTNK DKSQPGVDRG PCPTSSGKPD DVESASADAT VVYGNIKFGA LDSTY 146197067 uncultured symbiotic MLTLVYFLLS LVVSLEIGTQ QSEDHPKLTW QNGSSSVSGS IVLDSNWRWV HDSGTTNCYD GNLWSKDLCP protist of SSNTCSQKCY IEGADYSGTY GIQSSGSKLT LKFVTKGSYS TNIGSRVYLL KDENTYESFK LKNKEFTFTV Reticulitermes DDSKLNCGLN GALYFVAMDA DGGKAKYSSF KPGAKYGMGY CDAQCPHDMK FISGKANVDD WKPQDNDENS speratus GNGKLGTCCS EMDIWEGNMK SQAYTVHACT KSGQYECTGQ QCGDTDSGDR FKGTCDKDGC DYASWRWGDQ SFYGEGKTVD TKQPVTVVTQ FIGDPLTEIR RLYVQGGKTI NNSKTSNLAD TYDSITDKFC DATKEASGDT NDFKAKGAMS GFSTNLNNGQ VLVMSLWDDH TANMLWLDST YPTDSTKTGA SRGPCAVSSG VPKDVESQYG DATVIYSDIK FGAINSTFKW N 146197407 uncultured symbiotic MILALLSLAK SLGIATNQAE THPKLTWTRY QSKGSGQTVN GEIVLDSNWR WTHHSGTNCY DGNTWSTSLC protist of Cryptocercus PDPTTCSNNC DLDGADYPGT YGISTSGNSL KLGFVTHGSY STNIGSRVYL LRDSKNYEMF KLKNKEFTFT punctulatus VDDSKLPCGL NGALYFVAMD EDGGVSKNSI NKAGAQYGTG YCDAQCPHDM KFINGEANVL DWKPQSNDEN SGNGRYGACC TEMDIWEANS MATAYTPHVC TVTGLRRCEG TECGDTDANQ RYNGICDKDG CDFNSYRLGD KTFFGVGKTV DSSKPVTVVT QFVTSNGQDS GTLSEIRRKY VQGGKVIENS KVNIAGITAG NSVTDTFCNE QKKAFGDNND FEKKGGLGAL SKQLDAGMVL VLSLWDDHSV NMLWLDSTYP TNAAAGALGT ERGACATSSG APSDVESQSP DATVTFSDIK FGPIDSTY 146197157 uncultured symbiotic MLVIALILRG LSVGTGTQQS ETHPSLSWQQ TSKGGSGQSV SGSVVLDSNW RWTHTTDGTT NCYDGNEWSS protist of DLCPDASTCS SNCVLEGADY SGTYGITGSG SSLKLGFVTK GSYSTNIGSR VYLLGDESHY KLFKLENNEF Hodotermopsis TFTVDDSNLE CGLNGALYFV AMDEDGGASK YSGAKPGAKY GMGYCDAQCP HDMKFINGDA NVEGWKPSDN sjoestedti DENAGTGKWG ACCTEMDIWE ANKYATAYTP HICTKNGEYR CEGTDCGDTK DNNRYGGVCD KDGCDFNSWR MGNQSFWGPG LIIDTGKPVT VVTQFLADGG SLSEIRRKYV QGGKVIENTV TKISGMDEFD SITDEFCNQQ KKAFRDTNDF EKKGGLKGLG TAVDAGVVLV LSLWDDHDVN MLWLDSIYPT DSGSKAGADR GPCATSSGVP KDVESNYASA SVTFSDIKFG PIDSTY 146197403 uncultured symbiotic MLLALFAFGK SLGIATNQAE NHPKLTWTRY QSKGSGQTVN GEIVLDSNWR WTHHSGTNCY DGNTWSTSLC protist of Cryptocercus PDPTTCSNNC DLDGADYPGT 
YGISSSGNSL KLGFVTHGSY STNIGSRVYL LRDSKNYEMF KLKNKEFTFT punctulatus VDDSKLPCGL NGALYFVAMD EDGGVSKNSI NKAGAQYGTG YCDAQCPHDM KFINGEANVL DWKPQSNDEN SGNGRYGACC TEMDIWEANS MATAYTPHVC TVTGIRRCEG TECGDTDANQ RYNGICDKDG CDFNSYRLGD KSFFGVGKTV DSSKPVTVVT QFVTSNGQDS GTLSEIRRKY VQGGKVIENS KVNIAGMAAG NSITDTFCNE QKKAFGDNND FEKKGGLGAL SKQLDSGMVL VLSLWDDHSV NMLWLDSTYP TNAAAGALGT ERGACATSSG APSDVESQSP DATVTFSDIK FGPIDSTY 146197081 uncultured symbiotic MLASVVYLVS LVVSLEIGTQ QSEEHPKLTW QNGSSSVSGS IVLDSNWRWL HDSGTTNCYD GNLWSDDLCP protist of NADTCSSKCY IEGADYSGTY GITSSGSKVT LKFVTKGSYS TNIGSRIYLL KDENTYETFK LKNKEFTFTV Reticulitermes DDSKLDCGLN GALYFVAMDA DGGKAKYSSF KPGAKYGMGY CDAQCPHDMK FISGKANVDD WKPQDNDENS speratus GDGKLGTCCS EMDIWEGNAK SQAYTVHACS KSGQYECTGQ QCGDTDSGDR FKGTCDKDGC DYASWRWGDQ SFYGEGKTVD TKSPVTVVTQ FIGDPLTEIR RVYVQGGKTI NNSKTSNLAD TYDSITDKFC DATKDATGDT NDFKAKGAMA GFSTNLNTAQ VLVSVHCGMI IQPICCGLIR RIQRIQQKQV QAVDRVLCRR VFQRMLKASM VMLQSRTRTL SLELSTRPLV GISPAGRLFF F 146197413 uncultured symbiotic MILALLVLGK SLGIATNQAE THPKLTWTRY QSKGSGSTVN GEIVLDSNWR WTHHSGTNCY DGNTWSTSLC protist of Cryptocercus PDPTTCSNNC DLDGADYPGT YGISTSGNSL KLGFVTHGSY STNIGSRVYL LKDTKSYEMF KLKNKEFTFT punctulatus VDDSKLPCGL NGALYFVAMD EDGGVSKNSI NKAGAQYGTG YCDAQCPHDM KFINGEANVL DWKPQSNDEN SGNGRYGACC TEMDIWEANS MATAYTPHVC TVTGLRRCEG TECGDTDNDQ RYNGICDKDG CDFNSYRLGD KSFFGVGKTV DSSKPVTVVT QFVTSNGQDS GTLSEIRRKY VQGGKVIENS KVNVAGITAG NSVTDTFCNE QKKAFGDNND FEKKGGLGAL SKQLDAGMVL VLSLWDDHSV NMLWLDSTYP TNAAAGALGT ERGACATSSG KPSDVESQSP DATVTFSDIK FGPIDSTY 146197309 uncultured symbiotic MLCIGLISFV YSLGVGTNTA ETHPKLTWKN GGQTVNGEVT VDSNWRWTHT KGSTKNCYDG NLWSKDLCPD protist of Mastotermes AATCGKNCVL EGADYSGTYG VTSSGNALTL KFVTHGSYST NVGSRLYLLK DEKTYQMFNL NGKEFTFTVD darwiniensis VSNLPCGLNG ALYHVNMDED GGTKRYPDNE AGAKYGTGYC DAQCPTDLKF INGIPNSDGW KPQSNDKNSG NGKYGSCCSE MDIWEANSIC SAVTPHVCDN LQQTRCQGTA CGENGGGSRF GSSCDPDGCD FNSWRMGNKT FYGPGLIVDT KSKFTVVTQF VGNPVTEIKR KYVQNGKVIE NSYSNIEGMD KFNSVSDKFC TAQKKAFGDT DSFTKHGGFK 
QLGSALAKGM VLVLSLWDDH TVNMLWLDSV YPTNSKKAGS DRGPCPTTSG VPADVESKSA DANVIYSDIR FGAIDSTYK 146197227 uncultured symbiotic MLGALVALAS CIGVGTNTPE KHPDLKWTNG GSSVSGSIVV DSNWRWTHIK GETKNCYDGN LWSDKYCPDA protist of Neotermes ATCGKNCVLE GADYSGTYGV TTSGDAATLK FVTHGQYSTN VGSRLYLLKD EKTYQMFNLV GKEFTFTVDV koshunensis SNLPCGLNGA LYFVQMDSDG GMAKYPDNQA GAKYGTGYCD AQCPTDLKFI NGIPNSDGWK PQKNDKNSGN GKYGSCCSEM DIWEANSMAT AYTPHVCDKL EQTRCSGSAC GQNGGGDRFS SSCDPDGCDF NSWRMGNKTF WGPGLIVDTK KPVQVVTQFV GSGGSVTEIK RKYVQGGKVI DNSMTNIAAM SKQYNSVSDE FCQAQKKAFG DNDSFTKHGG FRQLGATLSK GHVLVLSLWD DHDVNMLWLD SVYPTNSNKP GADRGPCKTS SGVPSDVESQ NADSTVKYSD IRFGAIDSTY SK 146197253 uncultured symbiotic MLAAALFTFA CSVGVGTKTT ETHPKLNWQQ CACKGSCSQV SGEVTMDSNW RWTHDGNGKN CYDGNTWISS protist of Neotermes LCPDDKTCSD KCVLDGAEYQ ATYGIQSNGT ALTPKFVTHG SYSTNIGSRL YLLKDKSTYY VFQLNNKEFT koshunensis FSVDVSKLPC GLNGALYFVE MDADGGKSKY AGAKPGAEYG LGYCDAQCPS DLKFINGEAN SEGWKPQSGD KNAGNGKYGS CCSEMDVWES NSMATALTPH VCKTTGQTRC SGKSECGGQD GQDRFAGNCD EDGCDFNNWR MGDKTFFGPG LTVDTKSPFV VVTQFYGSPV TEIRRKYVQN GKVIENAKSN IPGIDATNAI SDTFCEQQKK AFGDTNDFKN KGGFTKLGSV FSRGMVLVLS LWDDHQVAML WLDSTYPTNK DKSVPGVDRG PCPTSSGKPD DVESASGDAT VVYGNIKFGA LDSTY 146197099 uncultured symbiotic MFGFLLSLFA LQFALEIGTQ TSESHPSITW ELNGARQSGQ IVIDSNWRWL HDSGTTNCYD GNTWSSDLCP protist of DPEKCSQNCY LEGADYSGTY GISASGSQLT LGFVTKGSYS TNIGSRVYLL KDENTYPMFK LKNKEFTFTV Reticulitermes DVSNLPCGLN GALYFVAMPS DGGKAKYPLA KPGAKYGMGY CDAQCPHDMK FINGEANVLD WKPQSNDENA speratus GTGRYGTCCT EMDIWEANSQ ATAYTVHACS KNARCEGTEC GDDSASQRYN GICDKDGCDF NSWRWGNKTF FGPGLTVDSS KPVTVVTQFI GDPLTEIRRI WVQGGKVIQN SFTNVSGITS VDSITNTFCD ESKVATGDTN DFKAKGGMSG FSKALDTEVV LVLSLWDDHT ANMLWLDSTY PTDSTAIGAS RGPCATSSGD PKDVESASAN ASVKFSDIKF GALDSTY 146197409 uncultured symbiotic MLASLLPLSN SLGTASNQAE THPKLTWTQY TGKGAGQTVN GEIVLDSNWR WTHKDGTNCY DGNTWSSSLC protist of Cryptocercus PDPTTCSNNC NLDGADYPGT YGITTSGNQL KLGFVTHGSY STNIGSRVYL LRDSKNYQMF KLKNKEFTFT punctulatus VDDSKLPCGL NGAVYFVAMD EDGGTAKHSI 
NKAGAQYGTG YCDAQCPHDM KFINGEANVL DWKPQSNDEN SGNGRWGARC TEMDIWEANS RATAYTPHIC TKTGLYRCEG TECGDSDTNR YGGVCDKDGC DFNSYRMGDK SFFGQGKTVD SSKPVTVVTQ FITDNNQDSG KLTEIRRKYV QGGKVIDNSK VNIAGITAGN PITDTFCDEA KKAFGDNNDF EKKGGLSALG TQLEAGFVLV LSLWDDHSVN MLWLDSTYPT NASPGALGVE RGDCAITSGV PADVESQSAD ASVTFSDIKF GPIDSTY 146197315 uncultured symbiotic MLCIGLISFV YSLGVGTNTA ETHPKLTWKN GGQTVNGEVT VDSNWRWTHT KGSTKNCYDG NLWSKDLCPD protist of Mastotermes AATCGKNCVL EGADYSGTYG VTSSGNALTL KFVTHGSYST NVGSRLYLLK DEKTYQMFNL NGKEFTFTVD darwiniensis VSNLPCGLSG ALYHVNMDED GGTKRYPDNE AGAKYGTGYC DAQCPTDLKF INGIPNSDGW KPQSNDKNSG NGKYGSCCSE MDIWEANSIC SAVTPHVCDN LQQTRCQGAA CGENGGGSRF GSSCDPDGCD FNSWGMGNKT FYGPGLIVDT KSKFTVVTQF VGNPVTEIKR KYVQNGKVIE NSYSNIEGMD KFNSVSDKFC TAQKKAFGDT DSFTKHGGFK QLGSALAKGM VLVLSLWDDH TVNMLWLDSV YPTNSKKAGS DRGPCPTTSG VPADVESKSA DANVIYSDIR FGAIDSTYK 146197411 uncultured symbiotic MILALLVLGK SLGIATNQAE THPKLTWTRY QSKGSGSTVN GEIVLDSNWR WTHHSGTNCY DGNTWSTSLC protist of Cryptocercus PDPTTCSNNC DLDGADYPGT YGISTSGNSL KLGFVTHGSY STNIGSRVYL LRDSKNYEMF KLKNKEFTFT punctulatus VDDSKLPCGL NGALYFVAMD EDGGVSKNSI NKAGAQYGTG YCDAQCPHDM KFINGEANVL DWKPQSNDEN SGNGRYGACC TEMDIWEANS MATAYTPHVC TVTGLRRCEG TECGDTDNDQ RYNGICDKDG CDFNSYRLGD KSFFGVGKTV DSSKPVTVVT QFVTSNGQDS GILSETRRKY VQGGKVIENS KVNVAGITAG NSVTDTFCNE QKKAFGDNND FEKKGGLGAL SKQLDAGMVL VLSLWDDHSV NMLWLDSTYP TNAAAGALGT ERGACATSSG KPSDVESQSP DATVTFSDIK FGPIDSTY 146197161 uncultured symbiotic MIGIVLIQTV FGIGVGTQQS ESHPSLSWQQ CSKGGSCTSV SGSIVLDSNW RWTHIPDGTT NCYDGNEWSS protist of DLCPDPTTCS NNCVLEGADY SGTYGISTSG SSAKLGFVTK GSYSTNIGSR VYLLGDESHY KIFDLKNKEF Hodotermopsis TFTVDDSNLE CGLNGALYFV AMDEDGGASR FTLAKPGAKY GTGYCDAQCP HDIKFINGEA NVQDWKPSDN sjoestedti DDNAGTGHYG ACCTEMDIWE ANKYATAYTP HICTENGEYR CEGKSCGDSS DDRYGGVCDK DGCDFNSWRL GNQSFWGPGL IIDTGKPVTV VTQFVTKDGT DSGALSEIRR KYVQGGKTIE NTVVKISGID EVDSITDEFC NQQKQAFGDT NDFEKKGGLS GLGKAFDYGV VLVLSLWDDH DVNMLWLDSV YPTNPAGKAG ADRGPCATSS GDPKEVEDKY ASASVTFSDI KFGPIDSTY 146197323 uncultured 
symbiotic MLVFGIVSFV YSIGVGTNTA ETHPKLTWKN GGSTTNGEVT VDSNWRWTHT KGSTKNCYDG NLWSKDLCPD protist of Mastotermes AATCGKNCVL EGADYSGTYG VTSSGDALTL KFVTHGSYST NVGSRLYLLK DEKTYQMFNL NGKEFTFTVD darwiniensis VSQLPCGLNG ALYFVCMDQD GGMSRYPDNQ AGAKYGTGYC DAQCPTDLKF INGLPNSDGW KPQSNDKNSG NGKYGSCCSE MDIWEANSLA TAVTPHVCDQ VGQTRCEGRA CGENGGGDRF GSICDPDGCD FNSWRMGNKT FWGPGLIIDT KKPVTVVTQF IGSPVTEIKR EYVQGGKVIE NSYTNIEGMD KFNSISDKFC TAQKKAFGDN DSFTKHGGFS KLGQSFTKGQ VLVLSLWDDH TVNMLWLDSV YPTNSKKLGS DRGPCPTSSG VPADVESKNA DSSVKYSDIR FGSIDSTYK 146197077 uncultured symbiotic MLSFVFLLGF GVSLEIGTQQ SENHPTLSWQ QCTSSGSCTS QSGSIVLDSN WRWVHDSGTT NCYDGNEWSS protist of DLCPDPETCS KNCYLDGADY SGTYGITSNG SSLKLGFVTE GSYSTNIGSR VYLKKDTNTY QIFKLKNHEF Reticulitermes TFTVDVSNLP CGLNGALYFV EMEADGGKGK YPLAKPGAQY GMGYCDAQCP HDMKFINGNA NVLDWKPQET speratus DENSGNGRYG TCCTEMDIWE ANSQATAYTP HICTKDGQYQ CEGTECGDSD ANQRYNGVCD KDGCDFNSYR LGNKTFFGPG LIVDSKKPVT VVTQFITSNG QDSGDLTEIR RIYVQGGKTI QNSFTNIAGL TSVDSITEAF CDESKDLFGD TNDFKAKGGF TAMGKSLDTG VVLVLSLWDD HSVNMLWLDS TYPTDAAAGA LGTQRGPCAT SSGAPSDVES QSPDASVTFS DIKFGPLDST Y 146197089 uncultured symbiotic MLTLVVYLLS LVVSLEIGTQ QSESHPALTW QREGSSASGS IVLDSNWRWV HDSGTTNCYD GNEWSTDLCP protist of SSDTCTQKCY IEGADYSGTY GITTSGSKLT LKFVTKGSYS TNIGSRVYLL KDENTYETFK LKNKEFTFTV Reticulitermes DDSKLDCGLN GALYFVAMDA DGGKQKYSSF KPGAKYGMGY CDAQCPHDMK FISGKANVED WKPQDNDENS speratus GNGKLGTCCS EMDIWEGNAK SQAYTVHACT KSGQYECTGT DCGDSDSRYQ GTCDKDGCDY ASYRWGDHSF YGEGKTVDTK QPITVVTQFI GDPLTEIRRL YIQGGKVINN SKTQNLASVY DSITDAFCDA TKAASGDTND FKAKGAMAGF SKNLDTPQVL VLSLWDDHTA NMLWLDSTYP TDSRDATAER GPCATSSGVP KDVESNQADA SVVFSDIKFG AINSTYSYN 146197091 uncultured symbiotic MFGFLLSLFA LQFALEIGTQ TSESHPSITW ELNGARQSGQ IVIDSNWRWL HDSGTTNCYD GNTWSSDLCP protist of DPEKCSQNCY LEGADYSGTY GISASGSQLT LGFVTKGSYS TNIGSRVYLL KDENTYQMFK LKNKEFTFTV Reticulitermes DVSNLPCGLN GALYFVAMPS DGGKAKYPLA KPGAKYGMGY CDAQCPHDMK FINGEANVLD WKPQSNDENA speratus GTGRYGTCCT EMDIWEANSQ ATAYTVHACS KNARCEGTEC GDDSASQRYN 
GICDKDGCDF NSWRWGNKTF FGPGLTVDSS KPVTVVTQFI GDPLTEIRRI WVQGGKVIQN SFTNVSGITS VDSITNTFCD ESKVATGDTN DFKAKGGMSG FSKALDTEVV LVLSLWDDHT ANMLWLDSTY PSNSTAIGAT RGPCATSSGD PKNVESASAN ASVKFSDIKF GAFDSTY 146197097 uncultured symbiotic MLALVYFLLS LVVSLEIGTQ QSEDHPKLTW QNGSSSVSGS IVLDSNWRWV HDSGTTNCYD GNLWSTDLCP protist of SSDTCTSKCY IEGADYSGTY GITSSGSKVT LKFVTKGSYS TNIGSRIYLL KDENTYETFK LKNKEFTFTV Reticulitermes DDSQLNCGLN GALYFVAMDA DGGKAKYSSF KPGAKYGMGY CDAQCPHDMK FISGKANVDD WKPQDNDENS speratus GNGKLGTCCS EMDIWEGNAK SQAYTVHACT KSGQYECTGQ QCGDTDSGDR FKGTCDKDGC DYASWRWGDQ SFYGEGKTVD TKQPVTVVTQ FIGDPLTEIR RLYVQGGKTI NNSKTSNLAD TYDSITDKFC DATKEASGDT NDFKAKGAMS GFSTNLNTAQ VLVLSLWDDH TANMLWLDST YPTDSTKTGA SRGPCAVTSG VPKDVESQYG SAQVVYSDIK FGAINSTY 146197095 uncultured symbiotic MLALVYFLLS FVVSLEIGTQ QSEDHPKLTW QNGSSSVSGS IVLDSNWRWV HDSGTTNCYD GNLWSTDLCG protist of SSDTCSSKCY IEGADYSGTY GISASGSKLT LKFVTKGSYS TNIGSRVYLL KDENTYETFK LKGKEFTFTV Reticulitermes DDSKLDCGLN GALYFVAMDA DGGKAKYSSF KPGAKYGMGY CDAQCPHDMK FISGKANVDD WKPQDNDENS speratus GNGKLGTCCS EMDIWEGNAK SQAYTVHACT KSGQYECTGQ QCGDTDSGDR FKGTCDKDGC DYASWRWGDQ SFYGEGKTID TKQPVTVVTQ FIGDPLTEIR RVYVQGGKVI NNSKTSNLAN VYDSITDKFC DDTKDATGDT NDFKAKGAMS GFSTNLNTAQ VLVMSLWDDH TANMLWLDST YPTDSTKTGA SRGPCAVLSG VPKNVESQHG DATVIYSDIK FGAINSTFSY N 146197401 uncultured symbiotic MFLALFVLGK SLGIATNQAE NHPKLTWTRY QSKGSGQTVN GEVVLDSNWR WTHHSGTNCY DGNTWSTSLC protist of Cryptocercus PDPQTCSSNC DLDGADYPGT YGISSSGNSL KLGFVTHGSY STNIGSRVYL LRDSKNYEMF KLKNKEFTFT punctulatus VDDSKLPCGL NGALYFVAME EDGGVAKNSI NKAGAQYGTG YCDAQCPHDM KFINGEANVL DWKPQSNDEN SGNGRYGACC IEMDIWEANS MATAYTPHVC TVTGIHRCEG TECGDTDANQ RYNGICDKDG CDFNSYRMGD KSFFGVGKTV DSSKPVTVVT QFVTSNGQDG GTLSEIKRKY VQGGKVIENS KVNIAGITAV NSITDTFCNE QKKAFGDNND FEKKGGLGAL SKQLDLGMVL VLSLWDDHSV NMLWLDSTYP TDAAAGALGT ERGACATSSG KPSDVESQSP DASVTFSDIK FGPIDSTY 146197225 uncultured symbiotic MLLCLLSIAN SLGVGTNTAE NHPKLSWKNG GSSVSGSVTV DANWRWTHIK GETKNCYDGN LWSDKYCPDA protist of Neotermes ATCGKNCVIE 
GADYQGTYGV SSSGDGLTLT FVTHGQYSTN VGSRLYLMKD EKTYQMFNLN GKEFTFTVDV koshunensis SNLPCGLNGA LYFVQMDSDG GMAKYPDNQA GAKYGTGYCD AQCPTDLKFI NGIPNSDGWK PQKNDKNSGN GKYGSCCSEM DIWEANSQAT AYTPHVCDKL EQTRCSGSSC GHTGGGERFS SSCDPDGCDF NSWRMGNKTF WGPGLIVDTK KPVQVVTQFV GSGNSCTEIK RKYVQGGKVI DNSMSNIAGM SKQYNSVSDD FCQAQKKAFG DNDSFTKHGG FRQLGATLGK GHVLVLSLWD DHDVNMLWLD SVYPTNSNKP GSDRGPCKTS SGIPADVESQ AASSSVKYSD IRFGAIDSTY K 146197317 uncultured symbiotic MLCIGLISFV YSLGVGTNTA ETHPKLTWKN GGQTVNGEVT VDSNWRWTHT KGSTKNCYDG NLWSKDLCPD protist of Mastotermes AATCGKNCVL EGADYSGTYG VTSSGNALTL KFVTHGSYST NVGSRLYLMK DEKTYQMFNL NGKEFTFTVD darwiniensis VSNLPCGLNG ALYHVNMDED GGTKRYPDNE AGAKYGTGYC DAQCPTDLKF INGIPNSDGW KPQSNDKNSG NGKYGSCCSE MDIWEANSIC SAVTPHVCDT LQQTRCQGTA CGENGGGSRF GSSCDPDGCD FNSWRMGNKT FYGPGLIVDT KSKFTVVTQF VGSPVTEIKR KYVQNGKVIE NSFSNIEGMD KFNSISDKFC TAQKKAFGDT DSFTKHGGFK QLGSALAKGM VLVLSLWDDH TVNMLWLDSV YPTNSKKAGS DRGPCPTTSG VPADVESKSA NANVIYSDIR FGAIDSTYK 146197251 uncultured symbiotic MLLCLLGIAS SLDAGTNTAE NHPQLSWKNG GSSVSGSVTV DANWRWTHIK GETKNCYDGN LWSDKYCPDA protist of Neotermes ATCGQNCVIE GADYQGTYGV SASGNALTLT FVTHGQYSTN VGSRLYLLKD EKTYQIFNLI GKEFTFTVDV koshunensis SNLPCGLNGA LYFVQMDADG GTAKYSDNKA GAKYGTGYCD AQCPTDLKFI NGIPNSDGWK PQKNDKNSGN GRYGSCCSEM DVWEANSLAT AYTPHVCDKL EQVRCDGRAC GQNGGGDRFS SSCDPDGCDF NSWRLGNKTF WGPGLIVDTK QPVQVVTQWV GSGTSVTEIK RKYVQGGKVI DNSFTKLDSL TKQYNSVSDE FCVAQKKAFG DNDSFTKHGG FRQLGATLAK GHVLVLSLWD DHDVNMLWLD SVYPTNSNKP GADRGPCKTS SGVPADVESQ AASSSVKYSD IRFGAIDSTY K 146197319 uncultured symbiotic MLGIGFVCIV YSLGVGTNTA ENHPKLTWKN SGSTTNGEVT VDSNWRWTHT KGTTKNCYDG NLWSKDLCPD protist of Mastotermes AATCGKNCVL EGADYSGTYG VTSSGDALTL KFVTHGSYST NVGSRLYLLK DEKTYQIFNL NGKEFTFTVD darwiniensis VSNLPCGLNG ALYFVNMDAD GGTGRYPDNQ AGAKYGTGYC DAQCPTDLKF INGIPNSDGW KPQSNDKNSG NGKYGSCCSE MDIWEANSLA TAVTPHVCDQ VGQTRCEGRA CGENGGGDRF GSSCDPDGCD FNSWRLGNKT FWGPGLIVDT KKPVTVVTQF VGSPVTEIKR KYVQGGKVIE NSYTNIEGLD KFNSISDKFC TAQKKAFGDN DSFIKHGGFR QLGQSFTKGQ VLVLSLWDDH 
TVNMLWLDSV YPTNSKKPGA DRGPCPTSSG VPADVESKNA GSSVKYSDIR FGSIDSTYK 146197071 uncultured symbiotic MATLVGILVS LFALEVALEI GTQTSESHPS LSWELNGQRQ TGSIVIDSNW RWLHDSGTTN CYDGNEWSSD protist of LCPDPEKCSQ NCYLEGADYS GTYGISSSGN SLQLGFVTKG SYSTNIGSRV YLLKDENTYA TFKLKNKEFT Reticulitermes FTADVSNLPC GLNGALYFVA MPADGGKSKY PLAKPGAKYG MGYCDAQCPH DMKFINGEAN ILDWKPSSND speratus ENAGAGRYGT CCTEMDIWEA NSQATAYTVH ACSKNARCEG TECGDDDGRY NGICDKDGCD FNSWRWGNKT FFGPNLIVDS SKPVTVVTQF IGDPLTEIRR IYVQGGKVIQ NSFTNISGVA SVDSITDAFC NENKVATGDT NDFKAKGGMS GFSKALDTEV VLVLSLWDDH TANMLWLDST YPTDSSALGA SRGPCAITSG EPKDVESASA NASVKFSDIK FGAIDSTY 146197075 uncultured symbiotic MLTLVYFLLS LVVSLEIGTQ QSESHPQLSW QNGSSSVSGS IVLDSNWRWV HDSGTTNCYD GNLWSTDLCP protist of SSDTCTSKCY IEGADYSGTY GITSSGSKLT LKFVTKGSYS TNIGSRVYLL KDENTYETFK LKNKEFTFTV Reticulitermes DDSKLDCGLN GALYFVAMDA DGGKAKYSSF KPGAKYGMGY CDAQCPHDMK FISGKANVDD WKPQDNDENS speratus GNGKLGTCCS EMDIWEGNAK SQAYTVHACT KSGQYECTGQ QCGDTDSGDR FKGTCDKDGC DYASWRWGDQ SFYGEGKTVD TKQPLTVVTQ FVGDPLTEIR RVYVQGGKTI NNSKTSNLAD TYDSITDKFC DATKEASGDT NDFKAKGAMS GFSTNLNTAQ VLVMSLWDDH TANMLWLDST YPTDSTKTGA SRGPCAVSSG VPKDVESQHG DATVIYSDIK FGAINSTFKW N 146197159 uncultured symbiotic MLSLVSIFLV GLGFSLGVGT QQSESHPSLS WQNCSAKGSC QSVSGSIVLD SNWRWLHDSG TTNCYDGNEW protist of STDLCPDAST CDKNCYIEGA DYSGTYGITS SGAQLKLGFV TKGSYSTNIG SRVYLLRDES HYQLFKLKNH Hodotermopsis EFTFTVDDSQ LPCGLNGALY FVEMAEDGGA KPGAQYGMGY CDAQCPHDMK FITGEANVKD WKPQETDENA sjoestedti GNGHYGACCT EMDIWEANSQ ATAYTPHICS KTGIYRCEGT ECGDNDANQR YNGVCDKDGC DFNSYRLGNK TFWGPGLTVD SNKAMIVVTQ FTTSNNQDSG ELSEIRRIYV QGGKTIQNSD TNVQGITTTN KITQAFCDET KVTFGDTNDF KAKGGFSGLS KSLESGAVLV LSLWDDHSVN MLWLDSTYPT DSAGKPGADR GPCAITSGDP KDVESQSPNA SVTFSDIKFG PIDSTY 146197405 uncultured symbiotic MILALLVLGK SLGIATNQAE THPKLTWTRY QSKGSGSTVN GEIVLDSNWR WTHHSGTNCY DGNTWSTSLC protist of Cryptocercus PDPTTCSNNC DLDGADYPGT YGISTSGNSL KLGFVTHGSY STNIGSRVYL LKDTKSYEMF KLKNKEFTFT punctulatus VDDSKLPCGL NGALYFVAMD EDGGVSKNSI NKAGAQYGTG YCDAQCPHDM 
KFINGEANVL DWKPQSNDEN SGNGRYGACC TEMDIWEANS MATAYTPHVC TVTGLRRCEG TECGDTDNDQ RYNGICDKDG CDFNSYRLGD KSFFGVGKTV DSSKPVTVVT QFVTSNGQDS GTLSEIRRKY VQGGKVIENS KVNVAGITAG NSVTDTFCNE QKKAFGDNND FEKKGGFGAL SKQLVAGMVL VLSLWDDHSV NMLWLDSTYP TNAAAGALGT ERGACATSSG KPSDVESQSP DATVTFSDIK FGPIDSTY 146197327 uncultured symbiotic MLCVGLFGLV YSIGVGTNTQ ETHPKLSWKQ CSSGGSCTTQ QGSVVIDSNW RWTHSTKDLT NCYDGNLWDS protist of Mastotermes TLCPDGTTCS KNCVLEGADY SGTYGITSSG DSLTLKFVTH GSYSTNVGSR LYLLKDDNNY QIFNLAGKEF darwiniensis TFTVDVSNLP CGLNGALYFV EMDQDGGKGK HKENEAGAKY GTGYCDAQCP TDLKFIDGIA NSDGWKPQDN DENSGNGKYG SCCSEMDIWE ANSLATAYTP HVCDTKGQKR CQGTACGENG GGDRFGSECD PDGCDFNSWR QGNKSFWGPG LIIDTKKSVQ VVTQFIGSGS SVTEIRRKYV QNGKVIENSY STISGTEKYN SISDDYCNAQ KKAFGDTNSF ENHGGFKRFS QHIQDMVLVL SLWDDHTVNM LWLDSVYPTN SNKPGADRGP CETSSGVPAD VESKSASASV KYSDIRFGPI DSTYK 146197261 uncultured symbiotic MLLCLWSIAY SLGVGTNTAE NHPKLSWKNG GSSVSGSVTV DANWRWTHIK GETKNCYDGN LWSDKYCPDA protist of Neotermes ATCGKNCVIE GADYQGTYGV SASGDGLTLT FVTHGQYSTN VGSRLYLMKD EKTYQIFNLN GKEFTFTVDV koshunensis SNLPCGLNGA LYFVQMDSDG GMAKYPDNQA GAKYGTGYCD AQCPTDLKFI NGIPNSDGWK PQKNDKNSGN GKYGSCCSEM DIWEANSQAT AYTPHVCDKL EQTRCSGSAC GHTGGGERFS SSCDPDGCDF NSWRMGNKTF WGPGLIVDTK KPVQVVTQFV GSGNSCTEIK RKYVQGGKVI DNSMSNIAGM TKQYNSVSDD FCQAQKKAFG DNDSFTKHGG FRQLGATLGK GHVLVLSLWD DHDVNMLWLD SVYPTNSNKP GSDRGPCKTS SGIPADVESQ AASSSVKYSD IRFGAIDSTY K

TABLE 2
Sequence Identifier (SEQ ID NO:) / Database Accession Number | Species of Origin | Position Corresponding to Position 268 | Position Corresponding to Position 411
BD29555* | Unknown | 273 | 422
340514556 | Trichoderma reesei | 268 | 411
51243029 | Penicillium occitanis | 273 | 422
7cel (PDB) & | Trichoderma reesei | 251 | 394
67516425 | Aspergillus nidulans FGSC A4 | 274 | 424
46107376 | Gibberella zeae PH-1 | 268 | 415
70992391 | Aspergillus fumigatus Af293 | 277 | 427
121699984 | Aspergillus clavatus NRRL 1 | 277 | 427
1906845 | Claviceps purpurea | 269 | 416
1gpi (PDB) & | Phanerochaete chrysosporium | 240 | 391
119468034 | Neosartorya fischeri NRRL 181 | 265 | 414
7804883 | Leptosphaeria maculans | 256 | 401
85108032 | Neurospora crassa N150 | 268 | 412
169859458 | Coprinopsis cinerea okayama | 270 | 421
154292161 | Botryotinia fuckeliana B05-10 | — | 410
169615761 # | Phaeosphaeria nodorum SN15 | 246 | 393
4883502 | Humicola grisea | 272 | 413
950686 | Humicola grisea | 270 | 416
124491660 | Chaetomium thermophilum | 272 | 413
58045187 | Chaetomium thermophilum | 270 | 416
169601100 # | Phaeosphaeria nodorum SN15 | 237 | 383
169870197 | Coprinopsis cinerea okayama | 269 | 421
3913806 | Agaricus bisporus | 263 | 414
169611094 | Phaeosphaeria nodorum SN15 | 270 | 414
3131 | Phanerochaete chrysosporium | — | 410
70991503 | Aspergillus fumigatus Af293 | 265 | 414
294196 | Phanerochaete chrysosporium | 258 | 409
18997123 | Thermoascus aurantiacus | 268 | 418
4204214 | Humicola grisea var thermoidea | 272 | 413
34582632 | Trichoderma viride (also known as Hypocrea rufa) | 268 | 411
156712284 | Thermoascus aurantiacus | 268 | 418
39977899 | Magnaporthe grisea (oryzae) 70-15 | 268 | 414
20986705 | Talaromyces emersonii | 266 | 416
22138843 | Aspergillus oryzae | 265 | 414
55775695 | Penicillium chrysogenum | 276 | 426
171676762 | Podospora anserina | 270 | 417
146350520 | Pleurotus sp Florida | 268 | 420
37732123 | Gibberella zeae | 268 | 415
156055188 | Sclerotinia sclerotiorum 1980 | — | 410
453224 | Phanerochaete chrysosporium | 258 | 409
50402144 | Trichoderma reesei | 268 | 411
115397177 | Aspergillus terreus NIH2624 | 274 | 424
154312003 | Botryotinia fuckeliana B05-10 | 266 | 416
49333365 | Volvariella volvacea | 268 | 420
729650 | Penicillium janthinellum | 274 | 424
146424871 | Pleurotus sp Florida | 267 | 418
67538012 | Aspergillus nidulans FGSC A4 | 265 | 410
62006162 | Fusarium poae | 268 | 415
146424873 | Pleurotus sp Florida | 267 | 418
295937 | Trichoderma viride | 268 | 411
6179889 # | Alternaria alternata | 240 | 386
119483864 | Neosartorya fischeri NRRL 181 | 278 | 428
85083281 | Neurospora crassa OR74A | 270 | 412
3913803 | Cryphonectria parasitica | 269 | 416
60729633 | Corticium rolfsii | 265 | 415
39971383 | Magnaporthe grisea 70-15 | 268 | 410
39973029 | Magnaporthe grisea 70-15 | 269 | 410
1170141 | Fusarium oxysporum | 268 | 415
121710012 | Aspergillus clavatus NRRL 1 | 265 | 414
17902580 | Penicillium funiculosum | 273 | 422
1346226 | Humicola grisea var thermoidea | 270 | 416
156712282 | Chaetomium thermophilum | 270 | 416
169768818 | Aspergillus oryzae RIB40 | 277 | 427
46241270 | Gibberella pulicaris | 268 | 415
49333363 | Volvariella volvacea | 265 | 418
46395332 | Irpex lacteus | 263 | 414
50844407 # | Chaetomium thermophilum var thermophilum | 245 | 391
4586347 | Irpex lacteus | 264 | 415
3980202 | Phanerochaete chrysosporium | 258 | 410
27125837 | Melanocarpus albomyces | 273 | 414
171696102 | Podospora anserina | 265 | 415
3913802 | Cochliobolus carbonum | 270 | 416
50403723 | Trichoderma viride | 268 | 411
3913798 | Aspergillus aculeatus | 275 | 425
66828465 | Dictyostelium discoideum | 269 | 419
156060391 | Sclerotinia sclerotiorum 1980 | 252 | 402
116181754 | Chaetomium globosum CBS 148-51 | 263 | 413
145230535 | Aspergillus niger | 274 | 424
46241266 | Nectria haematococca mpVI | 268 | 415
1q9h (PDB) # | Talaromyces emersonii | 248 | 398
157362170 | Polyporus arcularius | 269 | 420
7804885 | Leptosphaeria maculans | 267 | 407
121852 | Phanerochaete chrysosporium | 258 | 409
126013214 | Penicillium decumbens | 264 | 415
156048578 | Sclerotinia sclerotiorum 1980 | 265 | 413
156712278 | Acremonium thermophilum | 269 | 414
21449327 | Aspergillus nidulans | 265 | 410
171683762 | Podospora anserina | 274 | 415
56718412 | Thermoascus aurantiacus var levisporus | 268 | 418
15824273 | Pseudotrichonympha grassii | 263 | 414
115390801 | Aspergillus terreus NIH2624 | 266 | 411
453223 | Phanerochaete chrysosporium | 258 | 409
3132 | Phanerochaete chrysosporium | — | 407
16304152 | Thermoascus aurantiacus | 268 | 417
156712280 | Acremonium thermophilum | 273 | 420
5231154 | Volvariella volvacea | 281 | 438
116200349 | Chaetomium globosum CBS 148-51 | 270 | 412
4586343 | Irpex lacteus | 263 | 414
15321718 | Lentinula edodes | — | 417
146424875 | Pleurotus sp Florida | 267 | 418
62006158 | Fusarium venenatum | 268 | 415
296027 | Phanerochaete chrysosporium | 258 | 409
154449709 | Fusicoccum sp BCC4124 | 272 | 424
169859460 | Coprinopsis cinerea okayama | 269 | 421
50400675 | Trichoderma harzianum | 264 | 407
729649 | Neurospora crassa | 262 | 406
119472134 | Neosartorya fischeri NRRL 181 | 277 | 427
117935080 | Chaetomium thermophilum | 272 | 413
154300584 | Botryotinia fuckeliana B05-10 | 265 | 413
15824271 | Pseudotrichonympha grassii | 263 | 414
4586345 | Irpex lacteus | 263 | 414
46241268 | Gibberella avenacea | 268 | 416
6164684 | Aspergillus niger | 274 | 424
6164682 | Aspergillus niger | 266 | 412
33733371 | Chrysosporium lucknowense US6573086-10 | 269 | 415
29160311 | Thielavia australiensis | 269 | 415
146197087 | uncultured symbiotic protist of Reticulitermes speratus | 260 | 402
146197237 | uncultured symbiotic protist of Neotermes koshunensis | 264 | 409
146197067 | uncultured symbiotic protist of Reticulitermes speratus | 260 | 402
146197407 | uncultured symbiotic protist of Cryptocercus punctulatus | 261 | 412
146197157 | uncultured symbiotic protist of Hodotermopsis sjoestedti | 264 | 410
146197403 | uncultured symbiotic protist of Cryptocercus punctulatus | 261 | 412
146197081 | uncultured symbiotic protist of Reticulitermes speratus | 260 | 410
146197413 | uncultured symbiotic protist of Cryptocercus punctulatus | 261 | 412
146197309 | uncultured symbiotic protist of Mastotermes darwiniensis | 259 | 402
146197227 | uncultured symbiotic protist of Neotermes koshunensis | 258 | 404
146197253 | uncultured symbiotic protist of Neotermes koshunensis | 264 | 409
146197099 | uncultured symbiotic protist of Reticulitermes speratus | 258 | 401
146197409 | uncultured symbiotic protist of Cryptocercus punctulatus | 260 | 411
146197315 | uncultured symbiotic protist of Mastotermes darwiniensis | 259 | 402
146197411 | uncultured symbiotic protist of Cryptocercus punctulatus | 261 | 412
146197161 | uncultured symbiotic protist of Hodotermopsis sjoestedti | 263 | 413
146197323 | uncultured symbiotic protist of Mastotermes darwiniensis | 259 | 402
146197077 | uncultured symbiotic protist of Reticulitermes speratus | 264 | 415
146197089 | uncultured symbiotic protist of Reticulitermes speratus | 258 | 400
146197091 | uncultured symbiotic protist of Reticulitermes speratus | 258 | 401
146197097 | uncultured symbiotic protist of Reticulitermes speratus | 260 | 402
146197095 | uncultured symbiotic protist of Reticulitermes speratus | 260 | 402
146197401 | uncultured symbiotic protist of Cryptocercus punctulatus | 261 | 412
146197225 | uncultured symbiotic protist of Neotermes koshunensis | 258 | 404
146197317 | uncultured symbiotic protist of Mastotermes darwiniensis | 259 | 402
146197251 | uncultured symbiotic protist of Neotermes koshunensis | 258 | 404
146197319 | uncultured symbiotic protist of Mastotermes darwiniensis | 259 | 402
146197071 | uncultured symbiotic protist of Reticulitermes speratus | 259 | 402
146197075 | uncultured symbiotic protist of Reticulitermes speratus | 260 | 402
146197159 | uncultured symbiotic protist of Hodotermopsis sjoestedti | 260 | 410
146197405 | uncultured symbiotic protist of Cryptocercus punctulatus | 261 | 412
146197327 | uncultured symbiotic protist of Mastotermes darwiniensis | 264 | 408
146197261 | uncultured symbiotic protist of Neotermes koshunensis | 258 | 404
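
The correspondence recorded in Table 2 can be read as a lookup from a reference position (268 or 411) to the equivalent position in another polypeptide. The following is a minimal sketch in Python, assuming that a "—" entry means no corresponding position is listed; the names `TABLE_2_SUBSET` and `equivalent_position` are hypothetical and only a handful of rows from Table 2 are copied in.

```python
from typing import Optional

# A few rows copied from Table 2:
# accession -> (position matching reference 268, position matching reference 411).
# None stands for a "—" entry; reading "—" as "no corresponding position listed"
# is an assumption of this sketch.
TABLE_2_SUBSET: dict[str, tuple[Optional[int], Optional[int]]] = {
    "340514556": (268, 411),    # Trichoderma reesei
    "51243029": (273, 422),     # Penicillium occitanis
    "7cel (PDB)": (251, 394),   # Trichoderma reesei
    "154292161": (None, 410),   # Botryotinia fuckeliana B05-10
}


def equivalent_position(accession: str, reference_position: int) -> Optional[int]:
    """Return the position in `accession` that Table 2 lists for reference 268 or 411."""
    if reference_position not in (268, 411):
        raise ValueError("Table 2 only lists positions corresponding to 268 and 411")
    pos_268, pos_411 = TABLE_2_SUBSET[accession]
    return pos_268 if reference_position == 268 else pos_411


if __name__ == "__main__":
    for acc in TABLE_2_SUBSET:
        print(acc, equivalent_position(acc, 268), equivalent_position(acc, 411))
```

A lookup of this kind only restates the table; it does not perform the sequence alignment from which the correspondences were derived.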

TABLE 3
SEQ ID NO: / Database Accession Number | Species of Origin | Signal Sequence (SS) Start and End Position | Catalytic Domain (CD) Start and End Position | Linker Start and End Position | Cellulose Binding Domain (CBD) Start and End
BD29555* | Unknown | 1-25 | 26-455 | 456-493 | 494-529
340514556 | Trichoderma reesei | 1-17 | 18-444 | 445-479 | 480-514
51243029 | Penicillium occitanis | 1-25 | 26-455 | 456-493 | 494-529
7cel (PDB) & | Trichoderma reesei | N/A | 1-427 | N/A | N/A
67516425 | Aspergillus nidulans FGSC A4 | 1-23 | 24-457 | 458-490 | 491-526
46107376 | Gibberella zeae PH-1 | 1-17 | 18-448 | 449-476 | 477-512
70992391 | Aspergillus fumigatus Af293 | 1-26 | 27-460 | 461-496 | 497-532
121699984 | Aspergillus clavatus NRRL 1 | 1-27 | 27-460 | 461-503 | 504-539
1906845 | Claviceps purpurea | 1-19 | 20-449 | N/A | N/A
1gpi (PDB) & | Phanerochaete chrysosporium | N/A | 1-424 | N/A | N/A
119468034 | Neosartorya fischeri NRRL 181 | 1-17 | 18-447 | N/A | N/A
7804883 | Leptosphaeria maculans | 1-17 | 18-434 | N/A | N/A
85108032 | Neurospora crassa N150 | 1-17 | 18-445 | 446-485 | 486-521
169859458 | Coprinopsis cinerea okayama | 1-18 | 19-454 | N/A | N/A
154292161 | Botryotinia fuckeliana B05-10 | 1-18 | 19-443 | 444-555 | 556-596
169615761 # | Phaeosphaeria nodorum SN15 | 1 | 2-426 | N/A | N/A
4883502 | Humicola grisea | 1-22 | 23-446 | N/A | N/A
950686 | Humicola grisea | 1-18 | 19-449 | 450-489 | 490-525
124491660 | Chaetomium thermophilum | 1-22 | 23-446 | N/A | N/A
58045187 | Chaetomium thermophilum | 1-18 | 19-449 | 450-494 | 495-530
169601100 # | Phaeosphaeria nodorum SN15 | 1 | 2-416 | N/A | N/A
169870197 | Coprinopsis cinerea okayama | 1-18 | 19-454 | N/A | N/A
3913806 | Agaricus bisporus | 1-18 | 19-447 | 448-470 | 471-506
169611094 | Phaeosphaeria nodorum SN15 | 1-18 | 19-447 | N/A | N/A
3131 | Phanerochaete chrysosporium | 1-19 | 20-443 | N/A | N/A
70991503 | Aspergillus fumigatus Af293 | 1-17 | 18-447 | N/A | N/A
294196 | Phanerochaete chrysosporium | 1-18 | 19-442 | 443-480 | 481-516
18997123 | Thermoascus aurantiacus | 1-17 | 18-451 | N/A | N/A
4204214 | Humicola grisea var thermoidea | 1-22 | 23-446 | N/A | N/A
34582632 | Trichoderma viride (also known as Hypocrea rufa) | 1-18 | 18-444 | 445-479 | 480-514
156712284 | Thermoascus aurantiacus | 1-17 | 18-451 | N/A | N/A
39977899 | Magnaporthe grisea (oryzae) 70-15 | 1-17 | 18-447 | N/A | N/A
20986705 | Talaromyces emersonii | 1-18 | 19-449 | N/A | N/A
22138843 | Aspergillus oryzae | 1-17 | 18-447 | N/A | N/A
55775695 | Penicillium chrysogenum | 1-25 | 26-459 | 460-494 | 495-529
171676762 | Podospora anserina | 1-18 | 19-450 | 451-492 | 493-528
146350520 | Pleurotus sp Florida | 1-18 | 19-453 | N/A | N/A
37732123 | Gibberella zeae | 1-17 | 18-448 | 449-476 | 477-512
156055188 | Sclerotinia sclerotiorum 1980 | 1-18 | 19-443 | 444-546 | 547-586
453224 | Phanerochaete chrysosporium | 1-18 | 19-442 | 443-474 | 475-510
50402144 | Trichoderma reesei | 1-17 | 18-444 | 445-478 | 479-513
115397177 | Aspergillus terreus NIH2624 | 1-23 | 24-457 | 458-505 | 506-541
154312003 | Botryotinia fuckeliana B05-10 | 1-17 | 18-449 | 450-480 | 481-516
49333365 | Volvariella volvacea | 1-18 | 19-453 | N/A | N/A
729650 | Penicillium janthinellum | 1-25 | 26-456 | 457-502 | 503-537
146424871 | Pleurotus sp Florida | 1-18 | 19-451 | 452-487 | 488-523
67538012 | Aspergillus nidulans FGSC A4 | 1-17 | 18-443 | N/A | N/A
62006162 | Fusarium poae | 1-17 | 18-448 | 449-475 | 476-511
146424873 | Pleurotus sp Florida | 1-18 | 19-451 | 452-487 | 488-523
295937 | Trichoderma viride | 1-17 | 18-444 | 445-478 | 479-513
6179889 # | Alternaria alternata | 1 | 2-419 | N/A | N/A
119483864 | Neosartorya fischeri NRRL 181 | 1-26 | 27-461 | 462-499 | 500-535
85083281 | Neurospora crassa OR74A | 1-20 | 21-445 | N/A | N/A
3913803 | Cryphonectria parasitica | 1-18 | 19-449 | N/A | N/A
60729633 | Corticium rolfsii | 1-18 | 19-448 | 449-492 | 493-528
39971383 | Magnaporthe grisea 70-15 | 1-17 | 18-443 | N/A | N/A
39973029 | Magnaporthe grisea 70-15 | 1-19 | 20-443 | N/A | N/A
1170141 | Fusarium oxysporum | 1-17 | 18-448 | 449-478 | 479-514
121710012 | Aspergillus clavatus NRRL 1 | 1-17 | 18-447 | N/A | N/A
17902580 | Penicillium funiculosum | 1-25 | 26-455 | 456-493 | 494-529
1346226 | Humicola grisea var thermoidea | 1-18 | 19-449 | 450-489 | 490-525
156712282 | Chaetomium thermophilum | 1-18 | 19-449 | 450-496 | 497-532
169768818 | Aspergillus oryzae RIB40 | 1-25 | 26-460 | N/A | N/A
46241270 | Gibberella pulicaris | 1-17 | 18-448 | 449-474 | 475-510
49333363 | Volvariella volvacea | 1-18 | 19-451 | 452-476 | 477-512
46395332 | Irpex lacteus | 1-18 | 19-447 | 448-485 | 486-521
50844407 # | Chaetomium thermophilum var thermophilum | N/A | 1-424 | 425-469 | 470-505
4586347 | Irpex lacteus | 1-18 | 19-448 | 449-490 | 491-526
3980202 | Phanerochaete chrysosporium | 1-18 | 19-443 | 444-475 | 476-511
27125837 | Melanocarpus albomyces | 1-23 | 23-447 | N/A | N/A
171696102 | Podospora anserina | 1-17 | 17-448 | N/A | N/A
3913802 | Cochliobolus carbonum | 1-18 | 19-449 | N/A | N/A
50403723 | Trichoderma viride | 1-17 | 18-444 | 445-479 | 480-514
3913798 | Aspergillus aculeatus | 1-22 | 23-458 | 459-505 | 506-540
66828465 | Dictyostelium discoideum | 1-19 | 20-452 | N/A | N/A
156060391 | Sclerotinia sclerotiorum 1980 | 1-17 | 18-435 | 436-470 | 471-504
116181754 | Chaetomium globosum CBS 148-51 | 1-17 | 18-446 | N/A | N/A
145230535 | Aspergillus niger | 1-21 | 22-457 | 458-500 | 501-536
46241266 | Nectria haematococca mpVI | 1-18 | 18-448 | 449-472 | 473-508
1q9h (PDB) # | Talaromyces emersonii | N/A | 1-431 | N/A | N/A
157362170 | Polyporus arcularius | 1-18 | 19-453 | N/A | N/A
7804885 | Leptosphaeria maculans | 1-20 | 21-440 | N/A | N/A
121852 | Phanerochaete chrysosporium | 1-18 | 19-442 | 443-480 | 481-516
126013214 | Penicillium decumbens | 1-17 | 18-448 | N/A | N/A
156048578 | Sclerotinia sclerotiorum 1980 | 1-16 | 17-446 | N/A | N/A
156712278 | Acremonium thermophilum | 1-17 | 18-447 | 448-487 | 488-523
21449327 | Aspergillus nidulans | 1-17 | 18-443 | N/A | N/A
171683762 | Podospora anserina | 1-22 | 23-448 | N/A | N/A
56718412 | Thermoascus aurantiacus var levisporus | 1-17 | 18-451 | N/A | N/A
15824273 | Pseudotrichonympha grassii | 1-20 | 21-447 | N/A | N/A
115390801 | Aspergillus terreus NIH2624 | 1-17 | 18-444 | N/A | N/A
453223 | Phanerochaete chrysosporium | 1-18 | 19-442 | 443-474 | 475-510
3132 | Phanerochaete chrysosporium | 1-19 | 20-436 | 437-467 | 468-504
16304152 | Thermoascus aurantiacus | 1-17 | 18-450 | N/A | N/A
156712280 | Acremonium thermophilum | 1-21 | 22-453 | N/A | N/A
5231154 | Volvariella volvacea | 1-15 | 16-472 | 473-500 | 501-536
116200349 | Chaetomium globosum CBS 148-51 | 1-20 | 21-445 | N/A | N/A
4586343 | Irpex lacteus | 1-18 | 19-447 | 448-481 | 482-517
15321718 | Lentinula edodes | 1-18 | 19-450 | 451-480 | 481-516
146424875 | Pleurotus sp Florida | 1-18 | 19-451 | 452-487 | 488-523
62006158 | Fusarium venenatum | 1-17 | 18-448 | 449-471 | 472-507
296027 | Phanerochaete chrysosporium | 1-18 | 19-442 | 443-480 | 481-516
154449709 | Fusicoccum sp BCC4124 | 1-19 | 20-457 | N/A | N/A
169859460 | Coprinopsis cinerea okayama | 1-18 | 19-454 | N/A | N/A
50400675 | Trichoderma harzianum | 1-17 | 18-440 | 441-470 | 471-505
729649 | Neurospora crassa | 1-17 | 18-439 | 440-480 | 481-516
119472134 | Neosartorya fischeri NRRL 181 | 1-26 | 27-460 | 461-494 | 495-530
117935080 | Chaetomium thermophilum | 1-22 | 23-446 | N/A | N/A
154300584 | Botryotinia fuckeliana B05-10 | 1-16 | 17-446 | N/A | N/A
15824271 | Pseudotrichonympha grassii | 1-20 | 21-447 | N/A | N/A
4586345 | Irpex lacteus | 1-18 | 19-447 | 448-487 | 488-523
46241268 | Gibberella avenacea | 1-17 | 18-449 | 450-478 | 478-513
6164684 | Aspergillus niger | 1-21 | 22-457 | 458-500 | 501-536
6164682 | Aspergillus niger | 1-17 | 18-445 | N/A | N/A
33733371 | Chrysosporium lucknowense US6573086-10 | 1-17 | 18-448 | 449-490 | 491-526
29160311 | Thielavia australiensis | 1-18 | 18-448 | 449-502 | 503-538
146197087 | uncultured symbiotic protist of Reticulitermes speratus | 1-22 | 23-435 | N/A | N/A
146197237 | uncultured symbiotic protist of Neotermes koshunensis | 1-20 | 21-442 | N/A | N/A
146197067 | uncultured symbiotic protist of Reticulitermes speratus | 1-22 | 23-435 | N/A | N/A
146197407 | uncultured symbiotic protist of Cryptocercus punctulatus | 1-19 | 20-445 | N/A | N/A
146197157 | uncultured symbiotic protist of Hodotermopsis sjoestedti | 1-20 | 21-443 | N/A | N/A
146197403 | uncultured symbiotic protist of Cryptocercus punctulatus | 1-19 | 20-445 | N/A | N/A
146197081 | uncultured symbiotic protist of Reticulitermes speratus | 1-22 | 23-443 | N/A | N/A
146197413 | uncultured symbiotic protist of Cryptocercus punctulatus | 1-19 | 20-445 | N/A | N/A
146197309 | uncultured symbiotic protist of Mastotermes darwiniensis | 1-20 | 21-435 | N/A | N/A
146197227 | uncultured symbiotic protist of Neotermes koshunensis | 1-19 | 20-437 | N/A | N/A
146197253 | uncultured symbiotic protist of Neotermes koshunensis | 1-21 | 21-442 | N/A | N/A
146197099 | uncultured symbiotic protist of Reticulitermes speratus | 1-22 | 23-434 | N/A | N/A
146197409 | uncultured symbiotic protist of Cryptocercus punctulatus | 1-19 | 20-444 | N/A | N/A
146197315 | uncultured symbiotic protist of Mastotermes darwiniensis | 1-20 | 21-435 | N/A | N/A
146197411 | uncultured symbiotic protist of Cryptocercus punctulatus | 1-19 | 20-445 | N/A | N/A
146197161 | uncultured symbiotic protist of Hodotermopsis sjoestedti | 1-20 | 21-446 | N/A | N/A
146197323 | uncultured symbiotic protist of Mastotermes darwiniensis | 1-20 | 21-435 | N/A | N/A
146197077 | uncultured symbiotic protist of Reticulitermes speratus | 1-21 | 22-448 | N/A | N/A
146197089 | uncultured symbiotic protist of Reticulitermes speratus | 1-22 | 23-433 | N/A | N/A
146197091 | uncultured symbiotic protist of Reticulitermes speratus | 1-22 | 23-434 | N/A | N/A
146197097 | uncultured symbiotic protist of Reticulitermes speratus | 1-22 | 23-435 | N/A | N/A
146197095 | uncultured symbiotic protist of Reticulitermes speratus | 1-22 | 23-435 | N/A | N/A
146197401 | uncultured symbiotic protist of Cryptocercus punctulatus | 1-19 | 20-445 | N/A | N/A
146197225 | uncultured symbiotic protist of Neotermes koshunensis | 1-19 | 20-437 | N/A | N/A
146197317 | uncultured symbiotic protist of Mastotermes darwiniensis | 1-20 | 21-435 | N/A | N/A
146197251 | uncultured symbiotic protist of Neotermes koshunensis | 1-19 | 20-437 | N/A | N/A
146197319 | uncultured symbiotic protist of Mastotermes darwiniensis | 1-20 | 21-435 | N/A | N/A
146197071 | uncultured symbiotic protist of Reticulitermes speratus | 1-25 | 26-435 | N/A | N/A
146197075 | uncultured symbiotic protist of Reticulitermes speratus | 1-22 | 23-435 | N/A | N/A
146197159 | uncultured symbiotic protist of Hodotermopsis sjoestedti | 1-23 | 24-443 | N/A | N/A
146197405 | uncultured symbiotic protist of Cryptocercus punctulatus | 1-19 | 20-445 | N/A | N/A
146197327 | uncultured symbiotic protist of Mastotermes darwiniensis | 1-20 | 21-441 | N/A | N/A
146197261 | uncultured symbiotic protist of Neotermes koshunensis | 1-19 | 20-437 | N/A | N/A
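
The domain boundaries in Table 3 can be used to place a residue in a domain and to convert between full-length and signal-free numbering. The sketch below is illustrative only: it uses the Table 3 row for accession 340514556 (Trichoderma reesei), and the function names are hypothetical. It assumes the 7cel (PDB) entry is numbered from the first residue of the catalytic domain, an assumption that appears consistent with Table 2, where the 7cel positions (251 and 394) are 17 lower than those listed for 340514556 (268 and 411).

```python
from typing import Optional

# Domain boundaries for accession 340514556 (Trichoderma reesei), copied from Table 3.
# Ranges are inclusive and use full-length (signal-sequence-containing) numbering.
DOMAINS_340514556: dict[str, tuple[int, int]] = {
    "signal sequence": (1, 17),
    "catalytic domain": (18, 444),
    "linker": (445, 479),
    "cellulose binding domain": (480, 514),
}


def domain_of(position: int, domains: dict[str, tuple[int, int]]) -> Optional[str]:
    """Return the name of the domain whose inclusive range contains `position`."""
    for name, (start, end) in domains.items():
        if start <= position <= end:
            return name
    return None


def to_signal_free_numbering(position: int, signal_end: int) -> int:
    """Convert full-length numbering to numbering that starts after the signal sequence."""
    if position <= signal_end:
        raise ValueError("position lies within the signal sequence")
    return position - signal_end


if __name__ == "__main__":
    signal_end = DOMAINS_340514556["signal sequence"][1]
    for pos in (268, 411):
        print(pos, domain_of(pos, DOMAINS_340514556),
              to_signal_free_numbering(pos, signal_end))
```

Both reference positions fall within the catalytic domain under these boundaries, and subtracting the signal sequence length reproduces the 7cel positions listed in Table 2.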

TABLE 4 Amino Acid Amino Acid Position of Positions of Positions of Active Catalytic Sequence Database Fragment in Site Loop in Residues in Identifier Accession Amino Acid Sequence of Fragment of Catalytic Domain Sequence Sequence Sequence (SEQ ID NO:) Number Species of Origin Including Loop and Catalytic Residue Identifier Identifier Identifier BD29555* Unknown NVEG WTPSSNNANTGLG NHGACCA E LDIW E ANS 210-242 214-226 234, 239 340514556 Trichoderma reesei NVEG WTPSANNANTGIG NHGACCA E LDIW E ANS 205-237 209-221 229, 234 51243029 Penicillium occitanis NVEG WEPSSNNANTGIG GHGSCCS E MDIW E ANS 210-242 214-226 234, 239 7cel (PDB) & Trichoderma reesei NVEG WEPSSNNANTGIG GHGSCCS E MDIW Q ANS 188-220 192-204 212, 217 67516425 Aspergillus nidulans NVEG WESSDTNPNGGVG NHGSCCA E MDIW E ANS 211-243 215-227 235, 240 FGSC A4 46107376 Gibberella zeae PH-1 NSDG WQPSDSDVNGGIG NLGTCCP E MDIW E ANS 205-237 209-221 229, 234 70992391 Aspergillus fumigatus NVEG WQPSSNDANAGTG NHGSCCA E MDIW E ANS 214-246 218-230 238, 243 Af293 121699984 Aspergillus clavatus NVEG WTPSSSDANAGNG GHGSCCA E MDIW E ANS 214-246 218-230 238, 243 NRRL 1 1906845 Claviceps purpurea NSKD WIPSKSDANAGIG SLGACCR E MDIW E ANN 206-238 210-222 230, 235 1gpi (PDB) & Phanerochaete NVGN WTETG-SNTGTG SYGTCCS E MDIW E ANN 185-215 189-199 207, 212 chrysosporium 119468034 Neosartorya fischeri NVEG WKPSSNDKNAGVG GHGSCCP E MDIW E ANS 202-234 206-218 226, 231 NRRL 181 7804883 Leptosphaeria NVEG WQPSKNDQNAGVG GHGSCCA E MDIW E ANS 193-225 197-209 217, 222 maculans 85108032 Neurospora crassa NVEG WTPSTNDANAGIG DHGTCCS E MDIW E ANK 205-237 209-221 229, 234 N150 (OR74A) 169859458 Coprinopsis cinerea NSAD WTPSETDPNAGRG RYGICCA E MDIW E ANS 207-239 211-223 231, 236 okayama 154292161 Botryotinia NVEG WVPDSNSANSGTG NIGSCCS E FDVW E ANS 203-235 207-219 227, 232 fuckeliana B05-10 169615761 # Phaeosphaeria NADG WQASTSDPNAGVG KKGACCA E MDVW E ANS 183-215 187-199 207, 212 nodorum SN15 4883502 Humicola grisea NIEG WRPSTNDPNAGVG PMGACCA E IDVW E SNA 
208-240 212-224 232, 237 950686 Humicola grisea NIEG WTGSTNDPNAGAG RYGTCCS E MDIW E ANN 207-239 211-223 231, 236 124491660 Chaetomium NIEG WRPSTNDANAGVG PYGACCA E IDVW E SNA 209-241 213-225 233, 238 thermophilum 58045187 Chaetomium NIEN WTPSTNDANAGFG RYGSCCS E MDIW E ANN 207-239 211-223 231, 236 thermophilum 169601100 # Phaeosphaeria NVEG WKPSDNDANAGVG GHGSCCA E MDIW E ANS 174-206 178-190 198, 203 nodorum SN15 169870197 Coprinopsis cinerea NSVG WEPSETDSNAGRG RYGICCA E MDIW E ANS 207-239 211-223 231, 236 okayama 3913806 Agaricus bisporus NSEG WEGSPNDVNAGTG NFGACCG E MDIW E ANS 203-235 207-219 227, 232 169611094 Phaeosphaeria NVEG WNPSDADPNAGSG KIGACCP E MDIW E ANS 208-240 212-224 232, 237 nodorum SN15 3131 Phanerochaete NVQG WNATS--ATTGTG SYGSCCT E LDIW E ANS 204-234 208-218 226, 231 chrysosporium 70991503 Aspergillus fumigatus NVEG WEPSSSDKNAGVG GHGSCCP E MDIW E ANS 202-234 206-218 226, 231 Af293 294196 Phanerochaete NVEG WNATS--ANAGTG NYGTCCT E MDIW E ANN 203-233 207-217 225, 230 chrysosporium 18997123 Thermoascus NVEG WQPSANDPNAGVG NHGSSCA E MDVW E ANS 205-237 209-221 229, 234 aurantiacus 4204214 Humicola grisea var NIEG WRPSTNDPNAGV GPMGACCA E IDVW E SNA 208-240 212-224 232, 237 thermoidea 34582632 Trichoderma viride NVEG WEPSSNNANTGIG GHGSCCS E MDIW E ANS 205-237 209-221 229, 234 (also known as Hypochrea rufa) 156712284 Thermoascus NVEG WQPSANDPNAGVG NHGSCCA E MDVW E ANS 205-237 209-221 229, 234 aurantiacus 39977899 Magnaporthe grisea NVEG WQPSSGDANSGVG NMGSCCA E MDIW E ANS 205-237 209-221 229, 234 (oryzae) 70-15 20986705 Talaromyces NVEG WQPSSNNANTGIG DHGSCCA E MDVW E ANS 203-235 207-219 227, 232 emersonii 22138843 Aspergillus oryzae R-KG WEPSDSDKNAGVG GHGSCCPQMDIW E ANS 203-234 206-218 226, 231 55775695 Penicillium NVEG WEPSSSDVNGGTG NYGSCCA E MDIW E ANS 213-245 217-229 237, 242 chrysogenum 171676762 Podospora anserina NIEG WNPSTNDVNAGAG RYGTCCS E MDIW E ANN 207-239 211-223 231, 236 146350520 Pleurotus sp Florida NVQG WQPSPNDSNAGKG QYGSCCA E MDIW E ANS 
207-239 211-223 231, 236 37732123 Gibberella zeae NSDG WQPSDSDVNGGIG NLGTCCP E MDIW E ANS 205-237 209-221 229, 234 156055188 Sclerotinia NNEG WVPDSNSANSGTG NIGSCCS E FDVW E ANS 203-235 207-219 227, 232 sclerotiorum 1980 453224 Phanerochaete NVGN WTETG--SNTGTG SYGTCCS E MDIW E ANN 203-233 207-217 225, 230 chrysosporium 50402144 Trichoderma reesei NVEG WEPSSNNANTGIG GHGSCCS E MDIW E ANS 205-237 209-221 229, 234 115397177 Aspergillus terreus NVEG WEPSANDANAGTG NHGSCCA E MDIW E ANS 211-243 215-227 235, 240 NIH2624 154312003 Botryotinia NSVG WTPSSNDVNAGAG QYGSCCS E MDIW E ANK 206-238 210-222 230, 235 fuckeliana B05-10 49333365 Volvariella volvacea NVQG WQPSPNDTNAGTG NYGACCN E MDVW E ANS 207-239 211-223 231, 236 729650 Penicillium NVDG WTPSKNDVNSGIG NHGSCCA E MDIW E ANS 211-243 215-227 235, 240 janthinellum 146424871 Pleurotus sp Florida NILD WSASATDANAGNG RYGACCA E MDIW E ANS 206-238 210-222 230, 235 67538012 Aspergillus nidulans NVEG WEPSDSDANAGVG GMGTCCP E MDIW E ANS 202-234 206-218 226, 231 FGSC A4 62006162 Fusarium poae NSDG WEPSKSDVNGGIG NLGTCCP E MDIW E ANS 205-237 209-221 229, 234 146424873 Pleurotus sp Florida NILD WSGSATDPNAGNG RYGACCA E MDIW E ANS 206-238 210-222 230, 235 295937 Trichoderma viride NVEG WEPSSNNANTGIG GHGSCCS E MDIW E ANS 205-237 209-221 229, 234 6179889 # Alternaria alternata NVEG WKPSSNDANAGVG GHGSCCA E MDIW E ANS 177-209 181-193 201, 206 119483864 Neosartorya fischeri NVEG WTPSSNNENTGLG NYGSCCA E LDIW E SNS 215-247 219-231 239, 244 NRRL 181 85083281 Neurospora crassa NIEG WTPSTNDANAGVG PYGGCCA E IDVW E SNA 207-239 211-223 231, 236 OR74A 3913803 Cryphonectria NVEG WTPSTNDANAGVG GLGSCCS E MDVW E ANS 206-238 210-222 230, 235 parasitica 60729633 Corticium rolfsii NLLD WNATS--ANSGTG SYGSCCP E MDIW E ANK 206-236 210-220 228, 233 39971383 Magnaporthe grisea NIEG WQPSSTDSSAGIG AQGACCA E IDIW E SNK 205-237 209-221 229, 234 70-15 39973029 Magnaporthe grisea NIEG WKPSSNDANAGVG PYGACCA E IDVW E SNA 206-238 210-222 230, 235 70-15 1170141 Fusarium 
oxysporum NSEG WKPSDSDVNAGVG NLGTCCP E MDIW E ANS 205-237 209-221 229, 234 121710012 Aspergillus clavatus NVEG WKPSDNDKNAGVG GYGSCCP E MDIW E ANS 202-234 206-218 226, 231 NRRL 1 17902580 Penicillium NVEG WTPSTNNSNTGIG NHGSCCA E LDIW E ANS 210-242 214-226 234, 239 funiculosum 1346226 Humicola grisea var NIEG WTGSTNDPNAGAG RYGTCCS E MDIW E ANN 207-239 211-223 231, 236 thermoidea 156712282 Chaetomium NVGN WTPSTNDANAGFG RYGSCCS E MDVW E ANN 207-239 211-223 231, 236 thermophilum 169768818 Aspergillus oryzae NVEG WVSSTNNANTGTG NHGSCCA E LDIW E SNS 214-246 218-230 238, 243 RIB40 46241270 Gibberella pulicaris NSDG WQPSKSDVNAGIG NMGTCCP E MDIW E ANS 205-237 209-221 229, 234 49333363 Volvariella volvacea NVAG WNGSPNDTNAGTG NWGACCN E MDIW E ANS 205-237 209-221 229, 234 46395332 Irpex lacteus NVAG WTGSSSDPNSGTG NYGTCCS E MDIW E ANS 202-234 206-218 226, 231 50844407 # Chaetomium NIEN WTPSTNDANAGFG RYGSCCS E MDIW E ANN 182-214 186-198 206, 211 thermophilum var thermophilum 4586347 Irpex lacteus NIVD WTASAGDANSGTG SFGTCCQ E MDIW E ANS 203-235 207-219 227, 232 3980202 Phanerochaete NVGN WTETG--SNTGTG SYGTCCS E MDIW E ANN 203-233 207-217 225, 230 chrysosporium 27125837 Melanocarpus NIEG WKSSTSDPNAGVG PYGSCCA E IDVW E SNA 210-242 214-226 234, 239 albomyces 171696102 Podospora anserina NVEG WGGAD--GNSGTG KYGICCA E MDIW E ANS 206-236 210-220 228, 233 3913802 Cochliobolus NVEG WNPSDADPNGGAG KIGACCP E MDIW E ANS 208-240 212-224 232, 237 carbonum 50403723 Trichoderma viride NVEG WEPSSNNANTGIG GHGSCCS E MDIW E ANS 205-237 209-221 229, 234 3913798 Aspergillus aculeatus NIEG WEPSSTDVNAGTG NHGSCCP E MDIW E ANS 210-242 214-226 234, 239 66828465 Dictyostelium NVDG WIPSTNNPNTGYG NLGSCCA E MDLW E ANN 206-238 210-222 230, 235 discoideum 156060391 Sclerotinia NSVG WTPSSNDVNTGTG QYGSCCS E MDIW E ANK 192-224 196-208 216, 221 sclerotiorum 1980 116181754 Chaetomium NSEG WGGED--GNSGTG KYGTCCA E MDIW E ANL 203-233 207-217 225, 230 globosum CBS 148- 51 145230535 Aspergillus niger NCDG WEPSSNNVNTGVG 
DHGSCCA E MDVW E ANS 209-241 213-225 233, 238 46241266 Nectria NSDE WKPSDSDKNAGVG KYGTCCP E MDIW E ANK 205-237 209-221 229, 234 haematococca mpVI 1q9h (PDB) # Talaromyces NVEG WQPSSNNANTGIG DHGSCCA E MDVW E ANS 185-217 189-201 209, 214 emersonii 157362170 Polyporus arcularius NVLD WAGSSNDPNAGTG HYGTCCN E MDIW E ANS 208-240 212-224 232, 237 7804885 Leptosphaeria NAEG WTKSASDPNSGVG KKGACCAQMDVW E ANS 204-236 208-220 228, 233 maculans 121852 Phanerochaete NVEG WNATS--ANAGTG NYGTCCT E MDIW E ANN 203-233 207-217 225, 230 chrysosporium 126013214 Penicillium NVEG WKPSANDKNAGVG PHGSCCA E MDIW E ANS 201-233 205-217 225, 230 decumbens 156048578 Sclerotinia NVDG WVPSSNNPNTGVG NYGSCCA E MDIW E ANS 202-234 206-218 226, 231 sclerotiorum 1980 156712278 Acremonium NIDG WQPSSNDANAGLG NHGSCCS E MDIW E ANK 206-238 210-222 230, 235 thermophilum 21449327 Aspergillus nidulans NVEG WEPSDSDANAGVG GMGTCCP E MDIW E ANS 202-234 206-218 226, 231 (also known as Emericella nidulans) 171683762 Podospora anserine NIE GWRESSNDENAGVG PYGGCCA E IDVW E SNA 211-243 215-227 235, 240 (S mat+) 56718412 Thermoascus NVEG WQPSANDPNAGVG NHGSCCA E MDVW E ANS 205-237 209-221 229, 234 aurantiacus var levisporus 15824273 Pseudotrichonympha NVEN WKPQTNDENAGNG RYGACCT E MDIW E ANK 200-232 204-216 224, 229 grassii 115390801 Aspergillus terreus NVEG WTPSDNDKNAGVG GHGSCCP E LDIW E ANS 203-235 207-219 227, 232 NIH2624 453223 Phanerochaete NVGN WTETG--SNTGTG SYGTCCS E MDIW E ANN 203-233 207-217 225, 230 chrysosporium 3132 Phanerochaete NVEG WLGTT--ATTGTG FFGSCCTDIALW E AND 202-232 206-216 224, 229 chrysosporium 16304152 Thermoascus NVEG WQPSANDPNAGVG NHGSSCA E MDVW E ANS 205-237 209-221 229, 234 aurantiacus 156712280 Acremonium NSAS WQPSSNDQNAGVG GMGSCCA E MDIW E ANS 210-242 214-226 234, 239 thermophilum 5231154 Volvariella volvacea NVQG WQPSPNDTNAGTG NYGACCNKMDVW E ANS 220-252 224-236 244, 249 116200349 Chaetomium NYDG WTPSSNDANAGVG ALGGCCA E IDVW E SNA 207-239 211-223 231, 236 globosum CBS 148- 51 4586343 Irpex 
lacteus NVAG WAGSASDPNAGSG TLGTCCS E MDIW E ANN 202-234 206-218 226, 231 15321718 Lentinula edodes NVEG WTPSSTSPNAGTG GTGICCN E MDIW E ANS 208-240 212-224 232, 237 146424875 Pleurotus sp Florida NVLD WSASATDDNAGNG RYGACCA E MDIW E ANS 206-238 210-222 230, 235 62006158 Fusarium venenatum NSDG WQPSKSDVNGGIG NLGTCCP E MDIW E ANS 205-237 209-221 229, 234 296027 Phanerochaete NVEG WNATS--ANAGTG NYGTCCT E MDIW E ANN 203-233 207-217 225, 230 chrysosporium 154449709 Fusicoccum sp NVQN WTASSTDKNAGTG HYGSCCN E MDIW E ANS 209-241 213-225 233, 238 BCC4124 169859460 Coprinopsis cinerea NSVG WEPSETDPNAGKG QYGICCA E MDIW E ANS 207-239 211-223 231, 236 okayama 50400675 Trichoderma NVEG WEPSSNNANTGVG GHGSCCS E MDIW E ANS 201-233 205-217 225, 230 harzianum (anamorph of Hypocrea lixii) 729649 Neurospora crassa NVEG WTPSTNDAN-GIG DHGSCCS E MDIW E ANK 200-231 204-215 223, 228 (OR74A) 119472134 Neosartorya fischeri NVEG WQPSSNDANAGTG NHGSCCA E MDIW E ANS 214-246 218-230 238, 243 NRRL 181 117935080 Chaetomium NIEG WRPSTNDANAGVG PYGACCA E IDVW E SNA 209-241 213-225 233, 238 thermophilum 154300584 Botryotinia NVDG WVPSSNNANTGVG NHGSCCA E MDIW E ANS 202-234 206-218 226, 231 fuckeliana B05-10 15824271 Pseudotrichonympha NVEN WKPQTNDENAGNG RYGACCT E MDIW E ANK 200-232 204-216 224, 229 grassii 4586345 Irpex lacteus NVEG WTGSSTDSNSGTG NYGTCCS E MDIW E ANS 202-234 206-218 226, 231 46241268 Gibberella avenacea NSDG WKPSDSDINAGIG NMGTCCP E MDIW E ANS 205-237 209-221 229, 234 6164684 Aspergillus niger NCDG WEPSSNNVNTGVG DHGSCCA E MDVW E ANS 209-241 213-225 233, 238 6164682 Aspergillus niger NVDG WEPSSNNDNTGIG NHGSCCP E MDIW E ANK 203-235 207-219 227, 232 33733371 Chrysosporium NVEN WQSSTNDANAGTG KYGSCCS E MDVW E ANN 206-238 210-222 230, 235 lucknowense U.S. Pat. No. 
6,573,086-10 29160311 Thielavia NVEG WESSTNDANAGSG KYGSCCT E MDVW E ANN 206-238 210-222 230, 235 australiensis 146197087 uncultured symbiotic NVDD WKPQDNDENSGNG KLGTCCS E MDIW E GNM 197-229 201-213 221, 226 protist of Reticulitermes speratus 146197237 uncultured symbiotic NSEG WKPQSGDKNAGNG KYGSCCS E MDVW E SNS 200-232 204-216 224, 229 protist of Neotermes koshunensis 146197067 uncultured symbiotic NVDD WKPQDNDENSGNG KLGTCCS E MDIW E GNM 197-229 201-213 221, 226 protist of Reticulitermes speratus 146197407 uncultured symbiotic NVLD WKPQSNDENSGNG RYGACCT E MDIW E ANS 198-230 202-214 222, 227 protist of Cryptocercus punctulatus 146197157 uncultured symbiotic NVEG WKPSDNDENAGTG KWGACCT E MDIW E ANK 201-233 205-217 225, 230 protist of Hodotermopsis sjoestedti 146197403 uncultured symbiotic NVLD WKPQSNDENSGNG RYGACCT E MDIW E ANS 198-230 202-214 222, 227 protist of Cryptocercus punctulatus 146197081 uncultured symbiotic NVDD WKPQDNDENSGDG KLGTCCS E MDIW E GNA 197-229 201-213 221, 226 protist of Reticulitermes speratus 146197413 uncultured symbiotic NVLD WKPQSNDENSGNG RYGACCT E MDIW E ANS 198-230 202-214 222, 227 protist of Cryptocercus punctulatus 146197309 uncultured symbiotic NSDG WKPQSNDKNSGNG KYGSCCS E MDIW E ANS 196-228 200-212 220, 225 protist of Mastotermes darwiniensis 146197227 uncultured symbiotic NSDG WKPQKNDKNSGNG KYGSCCS E MDIW E ANS 195-227 199-211 219, 224 protist of Neotermes koshunensis 146197253 uncultured symbiotic NSEG WKPQSGDKNAGNG KYGSCCS E MDVW E SNS 200-232 204-216 224, 229 protist of Neotermes koshunensis 146197099 uncultured symbiotic NVLD WKPQSNDENAGTG RYGTCCT E MDIW E ANS 197-229 201-213 221, 226 protist of Reticulitermes speratus 146197409 uncultured symbiotic NVLD WKPQSNDENSGNG RWGARCT E MDIW E ANS 198-230 202-214 222, 227 protist of Cryptocercus punctulatus 146197315 uncultured symbiotic NSDG WKPQSNDKNSGNG KYGSCCS E MDIW E ANS 196-228 200-212 220, 225 protist of Mastotermes darwiniensis 146197411 uncultured symbiotic NVLD WKPQSNDENSGNG 
RYGACCT E MDIW E ANS 198-230 202-214 222, 227 protist of Cryptocercus punctulatus 146197161 uncultured symbiotic NVQD WKPSDNDDNAGTG HYGACCT E MDIW E ANK 201-233 205-217 225, 230 protist of Hodotermopsis sjoestedti 146197323 uncultured symbiotic NSDG WKPQSNDKNSGNG KYGSCCS E MDIW E ANS 196-228 200-212 220, 225 protist of Mastotermes darwiniensis 146197077 uncultured symbiotic NVLD WKPQETDENSGNG RYGTCCT E MDIW E ANS 201-233 205-217 225, 230 protist of Reticulitermes speratus 146197089 uncultured symbiotic NVED WKPQDNDENSGNG KLGTCCS E MDIW E GNA 197-229 201-213 221, 226 protist of Reticulitermes speratus 146197091 uncultured symbiotic NVLD WKPQSNDENAGTG RYGTCCT E MDIW E ANS 197-229 201-213 221, 226 protist of Reticulitermes speratus 146197097 uncultured symbiotic NVDD WKPQDNDENSGNG KLGTCCS E MDIW E GNA 197-229 201-213 221, 226 protist of Reticulitermes speratus 146197095 uncultured symbiotic NVDD WKPQDNDENSGNG KLGTCCS E MDIW E GNA 197-229 201-213 221, 226 protist of Reticulitermes speratus 146197401 uncultured symbiotic NVLD WKPQSNDENSGNG RYGACCI E MDIW E ANS 198-230 202-214 222, 227 protist of Cryptocercus punctulatus 146197225 uncultured symbiotic NSDG WKPQKNDKNSGNG KYGSCCS E MDIW E ANS 195-227 199-211 219, 224 protist of Neotermes koshunensis 146197317 uncultured symbiotic NSDG WKPQSNDKNSGNG KYGSCCS E MDIW E ANS 196-228 200-212 220, 225 protist of Mastotermes darwiniensis 146197251 uncultured symbiotic NSDG WKPQKNDKNSGNG RYGSCCS E MDVW E ANS 195-227 199-211 219, 224 protist of Neotermes koshunensis 146197319 uncultured symbiotic NSDG WKPQSNDKNSGNG KYGSCCS E MDIW E ANS 196-228 200-212 220, 225 protist of Mastotermes darwiniensis 146197071 uncultured symbiotic NILD WKPSSNDENAGAG RYGTCCT E MDIW E ANS 200-232 204-216 224, 229 protist of Reticulitermes speratus 146197075 uncultured symbiotic NVDD WKPQDNDENSGNG KLGTCCS E MDIW E GNA 197-229 201-213 221, 226 protist of Reticulitermes speratus 146197159 uncultured symbiotic NVKD WKPQETDENAGNG HYGACCT E MDIW E ANS 197-229 
201-213 221, 226 protist of Hodotermopsis sjoestedti 146197405 uncultured symbiotic NVLD WKPQSNDENSGNG RYGACCT E MDIW E ANS 198-230 202-214 222, 227 protist of Cryptocercus punctulatus 146197327 uncultured symbiotic NSDG WKPQDNDENSGNG KYGSCCS E MDIW E ANS 201-233 205-217 225, 230 protist of Mastotermes darwiniensis 146197261 uncultured symbiotic NSDG WKPQKNDKNSGNG KYGSCCS E MDIW E ANS 195-227 199-211 219, 224 protist of Neotermes koshunensis

TABLE 5

Substitution(s) | Tolerance to 250 mg/L Cellobiose: % Activity in 4-MUL Assay (+/−Cellobiose)^(±) | Tolerance to Cellobiose Accumulation: % Activity in Bagasse Assay (−/+BG)^(¥)
None | 25% | 60%
R273K/R422K | 95% | 84%
R273K/Y274Q/D281K/Y410H/P411G/R422K | 78% | ND

TABLE 6

Substitution(s) | Tolerance to 250 mg/L Cellobiose: % Activity in 4-MUL Assay (+/−Cellobiose)^(±) | Tolerance to Cellobiose Accumulation: % Activity in Bagasse Assay (−/+BG)^(¥)
None | 23% | 74%
R268K/R411K | 92% | 94%
R268A/R411A | 92% | 95%
R268A/R411K | 97% | 94%
R268K/R411A | 97% | 102%
R268K | ND | 92%
R268A | ND | 86%
R411K | ND | 89%
R411A | ND | 94%
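
By way of illustration only, the percentage values in the tolerance columns above can be read as ratios of two activity measurements. The sketch below assumes that "(+/−Cellobiose)" denotes activity measured with cellobiose divided by activity measured without it, and that "(−/+BG)" denotes activity without β-glucosidase divided by activity with it. The function names and the numeric inputs are hypothetical and are not taken from the disclosure.

```python
from typing import Optional


def percent_activity(numerator_activity: float, denominator_activity: float) -> float:
    """Return numerator activity as a percentage of denominator activity.

    Assumed reading of the table headers (not confirmed by the text):
      (+/-Cellobiose): numerator = with cellobiose, denominator = without
      (-/+BG):         numerator = without BG,      denominator = with BG
    """
    if denominator_activity <= 0:
        raise ValueError("denominator activity must be positive")
    return 100.0 * numerator_activity / denominator_activity


def format_cell(value: Optional[float]) -> str:
    """Format a table cell; a missing measurement is shown as 'ND'."""
    return "ND" if value is None else f"{round(value)}%"


# Hypothetical readings in arbitrary units, for illustration only.
with_cellobiose = 80.0
without_cellobiose = 100.0
print(format_cell(percent_activity(with_cellobiose, without_cellobiose)))
print(format_cell(None))
```

Under this reading, a value near 100% indicates that the added or accumulated cellobiose has little effect on the measured activity, while a low value indicates strong product inhibition.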

TABLE 7 SEQ ID NO. Amino Acid Sequence MSALNSFNMY KSALILGSLL ATAGAQQIGT YTAETHPSLS WSTCKSGGSC TTNSGAITLD ANWRWVHGVN TSTNCYTGNT WNTAICDTDA SCAQDCALDG ADYSGTYGIT TSGNSLRLNF VTGSNVGSRT YLMADNTHYQ IFDLLNQEFT FTVDVSHLPC GLNGALYFVT MDADGGVSKY PNNKAGAQYG VGYCDSQCPR DLKFIAGQAN VEGWTPSSNN ANTGLGNHGA CCAELDIWEA NSISEALTPH PCDTPGLSVC TTDACGGTYS SDRYAGTCDP DGCDFNPYRL GVTDFYGSGK TVDTTKPITV VTQFVTDDGT STGTLSEIRR YYVQNGVVIP QPSSKISGVS GNVINSDFCD AEISTFGETA SFSKHGGLAK MGAGMEAGMV LVMSLWDDYS VNMLWLDSTY PTNATGTPGA ARGSCPTTSG DPKTVESQSG SSYVTFSDIR VGPFNSTFSG GSSTGGSSTT TASGTTTTKA SSTSTSSTST GTGVAAHWGQ CGGQGWTGPT TCASGTTCTV VNPYYSQCL MYRKLAVISA FLATARAQSA CTLQSETHPP LTWQKCSSGG TCTQQTGSVV IDANWRWTHA TNSSTNCYDG NTWSSTLCPD NETCAKNCCL DGAAYASTYG VTTSGNSLSI GFVTQSAQKN VGARLYLMAS DTTYQEFTLL GNEFSFDVDV SQLPCGLNGA LYFVSMDADG GVSKYPTNTA GAKYGTGYCD SQCPRDLKFI NGQANVEGWE PSSNNANTGI GGHGSCCSEM DIWEANSISE ALTPHPCTTV GQEICEGDGC GGTYSDNRYG GTCDPDGCDW NPYRLGNTSF YGPGSSFTLD TTKKLTVVTQ FETSGAINRY YVQNGVTFQQ PNAELGSYSG NELNDDYCTA EEAEFGGSSF SDKGGLTQFK KATSGGMVLV MSLWDDYYAN MLWLDSTYPT NETSSTPGAV RGSCSTSSGV PAQVESQSPN AKVTFSNIKF GPIGSTGNPS GGNPPGGNPP GTTTTRRPAT TTGSSPGPTQ SHYGQCGGIG YSGPTVCASG TTCQVLNPYY SQCL MSALNSFNMY KSALILGSLL ATAGAQQIGT YTAETHPSLS WSTCKSGGSC TTNSGAITLD ANWRWVHGVN TSTNCYTGNT WNSAICDTDA SCAQDCALDG ADYSGTYGIT TSGNSLRLNF VTGSNVGSRT YLMADNTHYQ IFDLLNQEFT FTVDVSHLPC GLNGALYFVT MDADGGVSKY PNNKAGAQYG VGYCDSQCPR DLKFIAGQAN VEGWTPSANN ANTGIGNHGA CCAELDIWEA NSISEALTPH PCDTPGLSVC TTDACGGTYS SDRYAGTCDP DGCDFNPYRL GVTDFYGSGK TVDTTKPFTV VTQFVTNDGT STGSLSEIRR YYVQNGVVIP QPSSKISGIS GNVINSDYCA AEISTFGGTA SFNKHGGLTN MAAGMEAGMV LVMSLWDDYA VNMLWLDSTY PTNATGTPGA ARGTCATTSG DPKTVESQSG SSYVTFSDIR VGPFNSTFSG GSSTGGSTTT TASRTTTTSA SSTSTSSTST GTGVAGHWGQ CGGQGWTGPT TCVSGTTCTV VNPYYSQCL ESACTLQSET HPPLTWQKCS SGGTCTQQTG SVVIDANWRW THATNSSTNC YDGNTWSSTL CPDNETCAKN CCLDGAAYAS TYGVTTSGNS LSIDFVTQSA QKNVGARLYL MASDTTYQEF TLLGNEFSFD VDVSQLPCGL NGALYFVSMD ADGGVSKYPT NTAGAKYGTG YCDSQCPRDL KFINGQANVE GWEPSSNNAN TGIGGHGSCC 
SEMDIWQANS ISEALTPHPC TTVGQEICEG DGCGGTYSDN RYGGTCDPDG CDWNPYRLGN TSFYGPGSSF TLDTTKKLTV VTQFETSGAI NRYYVQNGVT FQQPNAELGS YSGNELNDDY CTAEEAEFGG SSFSDKGGLT QFKKATSGGM VLVMSLWDDY YANMLWLDST YPTNETSSTP GAVRGSCSTS SGVPAQVESQ SPNAKVTFSN IKFGPIGSTG NPSG MASSFQLYKA LLFFSSLLSA VQAQKVGTQQ AEVHPGLTWQ TCTSSGSCTT VNGEVTIDAN WRWLHTVNGY TNCYTGNEWD TSICTSNEVC AEQCAVDGAN YASTYGITTS GSSLRLNFVT QSQQKNIGSR VYLMDDEDTY TMFYLLNKEF TFDVDVSELP CGLNGAVYFV SMDADGGKSR YATNEAGAKY GTGYCDSQCP RDLKFINGVA NVEGWESSDT NPNGGVGNHG SCCAEMDIWE ANSISTAFTP HPCDTPGQTL CTGDSCGGTY SNDRYGGTCD PDGCDFNSYR QGNKTFYGPG LTVDTNSPVT VVTQFLTDDN TDTGTLSEIK RFYVQNGVVI PNSESTYPAN PGNSITTEFC ESQKELFGDV DVFSAHGGMA GMGAALEQGM VLVLSLWDDN YSNMLWLDSN YPTDADPTQP GIARGTCPTD SGVPSEVEAQ YPNAYVVYSN IKFGPIGSTF GNGGGSGPTT TVTTSTATST TSSATSTATG QAQHWEQCGG NGWTGPTVCA SPWACTVVNS WYSQCL MYRAIATASA LIAAVRAQQV CSLTQESKPS LNWSKCTSSG CSNVKGSVTI DANWRWTHQV SGSTNCYTGN KWDTSVCTSG KVCAEKCCLD GADYASTYGI TSSGDQLSLS FVTKGPYSTN IGSRTYLMED ENTYQMFQLL GNEFTFDVDV SNIGCGLNGA LYFVSMDADG GKAKYPGNKA GAKYGTGYCD AQCPRDVKFI NGQANSDGWQ PSDSDVNGGI GNLGTCCPEM DIWEANSIST AYTPHPCTKL TQHSCTGDSC GGTYSNDRYG GTCDADGCDF NSYRQGNKTF YGPGSGFNVD TTKKVTVVTQ FHKGSNGRLS EITRLYVQNG KVIANSESKI AGVPGNSLTA DFCTKQKKVF NDPDDFTKKG AWSGMSDALE APMVLVMSLW HDHHSNMLWL DSTYPTDSTK LGSQRGSCST SSGVPADLEK NVPNSKVAFS NIKFGPIGST YKSDGTTPTN PTNPSEPSNT ANPNPGTVDQ WGQCGGSNYS GPTACKSGFT CKKINDFYSQ CQ MLASTFSYRM YKTALILAAL LGSGQAQQVG TSQAEVHPSM TWQSCTAGGS CTTNNGKVVI DANWRWVHKV GDYTNCYTGN TWDTTICPDD ATCASNCALE GANYESTYGV TASGNSLRLN FVTTSQQKNI GSRLYMMKDD STYEMFKLLN QEFTFDVDVS NLPCGLNGAL YFVAMDADGG MSKYPTNKAG AKYGTGYCDS QCPRDLKFIN GQANVEGWQP SSNDANAGTG NHGSCCAEMD IWEANSISTA FTPHPCDTPG QVMCTGDACG GTYSSDRYGG TCDPDGCDFN SFRQGNKTFY GPGMTVDTKS KFTVVTQFIT DDGTSSGTLK EIKRFYVQNG KVIPNSESTW TGVSGNSITT EYCTAQKSLF QDQNVFEKHG GLEGMGAALA QGMVLVMSLW DDHSANMLWL DSNYPTTASS TTPGVARGTC DISSGVPADV EANHPDAYVV YSNIKVGPIG STFNSGGSNP GGGTTTTTTT QPTTTTTTAG NPGGTGVAQH YGQCGGIGWT GPTTCASPYT CQKLNDYYSQ CL MLPSTISYRI YKNALFFAAL 
FGAVQAQKVG TSKAEVHPSM AWQTCAADGT CTTKNGKVVI DANWRWVHDV KGYTNCYTGN TWNAELCPDN ESCAENCALE GADYAATYGA TTSGNALSLK FVTQSQQKNI GSRLYMMKDD NTYETFKLLN QEFTFDVDVS NLPCGLNGAL YFVSMDADGG LSRYTGNEAG AKYGTGYCDS QCPRDLKFIN GLANVEGWTP SSSDANAGNG GHGSCCAEMD IWEANSISTA YTPHPCDTPG QAMCNGDSCG GTYSSDRYGG TCDPDGCDFN SYRQGNKSFY GPGMTVDTKK KMTVVTQFLT NDGTATGTLS EIKRFYVQDG KVIANSESTW PNLGGNSLTN DFCKAQKTVF GDMDTFSKHG GMEGMGAALA EGMVLVMSLW DDHNSNMLWL DSNSPTTGTS TTPGVARGSC DISSGDPKDL EANHPDASVV YSNIKVGPIG STFNSGGSNP GGSTTTTKPA TSTTTTKATT TATTNTTGPT GTGVAQPWAQ CGGIGYSGPT QCAAPYTCTK QNDYYSQCL MHPSLQTILL SALFTTAHAQ QACSSKPETH PPLSWSRCSR SGCRSVQGAV TVDANWLWTT VDGSQNCYTG NRWDTSICSS EKTCSESCCI DGADYAGTYG VTTTGDALSL KFVQQGPYSK NVGSRLYLMK DESRYEMFTL LGNEFTFDVD VSKLGCGLNG ALYFVSMDED GGMKRFPMNK AGAKFGTGYC DSQCPRDVKF INGMANSKDW IPSKSDANAG IGSLGACCRE MDIWEANNIA SAFTPHPCKN SAYHSCTGDG CGGTYSKNRY SGDCDPDGCD FNSYRLGNTT FYGPGPKFTI DTTRKISVVT QFLKGRDGSL REIKRFYVQN GKVIPNSVSR VRGVPGNSIT QGFCNAQKKM FGAHESFNAK GGMKGMSAAV SKPMVLVMSL WDDHNSNMLW LDSTYPTNSR QRGSKRGSCP ASSGRPTDVE SSAPDSTVVF SNIKFGPIGS TFSRGK ESACTLQSET HPPLTWQKCS SGGTCTQQTG SVVIDANWRW THATNSSTNC YDGNTWSSTL CPDNETCAKN CCLDGAAYAS TYGVTTSGNS LSIDFVTQSA QKNVGARLYL MASDTTYQEF TLLGNEFSFD VDVSQLPCGL NGALYFVSMD ADGGVSKYPT NTAGAKYGTG YCDSQCPRDL KFINGQANVE GWEPSSNNAN TGIGGHGSCC SEMDIWQANS ISEALTPHPC TTVGQEICEG DGCGGTYSDN RYGGTCDPDG CDWNPYRLGN TSFYGPGSSF TLDTTKKLTV VTQFETSGAI NRYYVQNGVT FQQPNAELGS YSGNELNDDY CTAEEAEFGG SSFSDKGGLT QFKKATSGGM VLVMSLWDDY YANMLWLDST YPTNETSSTP GAVRGSCSTS SGVPAQVESQ SPNAKVTFSN IKFGPIGSTG NPSG MHQRALLFSA LAVAANAQQV GTQKPETHPP LTWQKCTAAG SCSQQSGSVV IDANWRWLHS TKDTTNCYTG NTWNTELCPD NESCAQNCAV DGADYAGTYG VTTSGSELKL SFVTGANVGS RLYLMQDDET YQHFNLLNNE FTFDVDVSNL PCGLNGALYF VAMDADGGMS KYPSNKAGAK YGTGYCDSQC PRDLKFINGM ANVEGWKPSS NDKNAGVGGH GSCCPEMDIW EANSISTAVT PHPCDDVSQT MCSGDACGGT YSATRYAGTC DPDGCDFNPF RMGNESFYGP GKIVDTKSEM TVVTQFITAD GTDTGALSEI KRLYVQNGKV IANSVSNVAD VSGNSISSDF CTAQKKAFGD EDIFAKHGGL SGMGKALSEM VLIMSIWDDH HSSMMWLDST 
YPTDADPSKP GVARGTCEHG AGDPEKVESQ HPDASVTFSN IKFGPIGSTY KA MYRSLIFATS LLSLAKGQLV GNLYCKGSCT AKNGKVVIDA NWRWLHVKGG YTNCYTGNEW NATACPDNKS CATNCAIDGA DYRRLRHYCE RQLLGTEVHH QGLYSTNIGS RTYLMQDDST YQLFKFTGSQ EFTFDVDLSN LPCGLNGALY FVSMDADGGL KKYPTNKAGA KYGTGYCDAQ CPRDLKFING EGNVEGWQPS KNDQNAGVGG HGSCCAEMDI WEANSVSTAV TPHSCSTIEQ SRCDGDGCGG TYSADRYAGV CDPDGCDFNS YRMGVKDFYG KGKTVDTSKK FTVVTQFIGS GDAMEIKRFY VQNGKTIPQP DSTIPGVTGN SITTFFCDAQ KKAFGDKYTF KDKGGMANMP STCNGMVLVM SLWDDHYSNM LWLDSTYPTD KNPDTDAGSG RGECAITSGV PADVESQHPD ASVIYSNIKF GPINTTFG MLAKFAALAA LVASANAQAV CSLTAETHPS LNWSKCTSSG CTNVAGSITV DANWRWTHIT SGSTNCYSGN EWDTSLCSTN TDCATKCCVD GAEYSSTYGI QTSGNSLSLQ FVTKGSYSTN IGSRTYLMNG ADAYQGFELL GNEFTFDVDV SGTGCGLNGA LYFVSMDLDG GKAKYTNNKA GAKYGTGYCD AQCPRDLKYI NGIANVEGWT PSTNDANAGI GDHGTCCSEM DIWEANKVST AFTPHPCTTI EQHMCEGDSC GGTYSDDRYG GTCDADGCDF NSYRMGNTTF YGEGKTVDTS SKFTVVTQFI KDSAGDLAEI KRFYVQNGKV IENSQSNVDG VSGNSITQSF CNAQKTAFGD IDDFNKKGGL KQMGKALAKP MVLVMSIWDD HAANMLWLDS TYPVEGGPGA YRGECPTTSG VPAEVEANAP NSKVIFSNIK FGPIGSTFSG GSSGTPPSNP SSSVKPVTST AKPSSTSTAS NPSGTGAAHW AQCGGIGFSG PTTCQSPYTC QKINDYYSQC V MFKKVALTAL CFLAVAQAQQ VGREVAENHP RLPWQRCTRN GGCQTVSNGQ VVLDANWRWL HVTDGYTNCY TGNSWNSTVC SDPTTCAQRC ALEGANYQQT YGITTNGDAL TIKFLTRSQQ TNVGARVYLM ENENRYQMFN LLNKEFTFDV DVSKVPCGIN GALYFIQMDA DGGMSKQPNN RAGAKYGTGY CDSQCPRDIK FIDGVANSAD WTPSETDPNA GRGRYGICCA EMDIWEANSI SNAYTPHPCR TQNDGGYQRC EGRDCNQPRY EGLCDPDGCD YNPFRMGNKD FYGPGKTVDT NRKMTVVTQF ITHDNTDTGT LVDIRRLYVQ DGRVIANPPT NFPGLMPAHD SITEQFCTDQ KNLFGDYSSF ARDGGLAHMG RSLAKGHVLA LSIWNDHGAH MLWLDSNYPT DADPNKPGIA RGTCPTTGGT PRETEQNHPD AQVIFSNIKF GDIGSTFSGY MYSAAVLATF SFLLGAGAQQ VGTSTAETHP ALTVQKCAAG GTCTDESDSI VLDANWRWLH STSGSTNCYT GNTWDTTLCP DAATCTTNCA LDGADYEGTY GITTSGDSLK LSFVTGSNVG SRTYLMDSET TYKEFALLGN EFTFTVDVSK LPCGLNGALY FVPMDADGGM SKYPTNKAGA KYGTGYCDAQ CPQDMKFVNG TANVEGWVPD SNSANSGTGN IGSCCSEFDV WEANSMSQAL TPHVCTVDSQ TACTGDDCAS NTGVCDGDGC DFNPYRMGNT TFYGSGMTID TSKPFSVVTQ FITDDGTETG TLTEIKRFYV QDDVVYEQPS SDISGVSGNS 
ITDDFCAAQK TAFGDTDYFT QNGGMAAMGK KMADGMVLVL SIWDDYNVNM LWLDSDYPTT KDASTPGVSR GSCATDSGVP ATVEAASGSA YVTFSSIKYG PIGSTFNAPA DSSSSVSASS SPAPIASSSS SASIAPVSSV VAAIVSSSAQ AISSAAPVVS SSAQAISSAA PVVSSVVSSA APVATSSTKS KCSKVSSTLK TSVAAPATSA TSAAVVATSS AASSTGSVPL YGNCTGGKTC SEGTCVVQND YYSQCVASS MTWQRCTGTG GSSCTNVNGE IVIDANWRWI HATGGYTNCF DGNEWNKTAC PSNAACTKNC AIEGSDYRGT YGITTSGNSL TLKFITKGQY STNVGSRTYL MKDTNNYEMF NLIGNEFTFD VDLSQLPCGL NGALYFVSMP EKGQGTPGAK YGTGKLSQCS VHISKTLTDA CARDLKFVGG EANADGWQAS TSDPNAGVGK KGACCAEMDV WEANSMSTAL TPHSCQPEGY AVCEESNCGG TYSLDRYAGT CDANGCDFNP YRVGNKDFYG KGKTVDTSKK MTVVTQFLGT GSDLTELKRF YVQDGKVISN PEPTIPGMTG NSITQKWCDT QKEVFKEEVY PFNQWGGMAS MGKGMAQGMV LVMSLWDDHY SNMLWLDSTY PTDRDPESPG AARGECAITS GAPAEVEANN PDASVMFSNI KFGPIGSTFQ QPA MQIKSYIQYL AAALPLLSSV AAQQAGTITA ENHPRMTWKR CSGPGNCQTV QGEVVIDANW RWLHNNGQNC YEGNKWTSQC SSATDCAQRC ALDGANYQST YGASTSGDSL TLKFVTKHEY GTNIGSRFYL MANQNKYQMF TLMNNEFAFD VDLSKVECGI NSALYFVAME EDGGMASYPS NRAGAKYGTG YCDAQCARDL KFIGGKANIE GWRPSTNDPN AGVGPMGACC AEIDVWESNA YAYAFTPHAC GSKNRYHICE TNNCGGTYSD DRFAGYCDAN GCDYNPYRMG NKDFYGKGKT VDTNRKFTVV SRFERNRLSQ FFVQDGRKIE VPPPTWPGLP NSADITPELC DAQFRVFDDR NRFAETGGFD ALNEALTIPM VLVMSIWDDH HSNMLWLDSS YPPEKAGLPG GDRGPCPTTS GVPAEVEAQY PNAQVVWSNI RFGPIGSTVN V MRTAKFATLA ALVASAAAQQ ACSLTTERHP SLSWKKCTAG GQCQTVQASI TLDSNWRWTH QVSGSTNCYT GNKWDTSICT DAKSCAQNCC VDGADYTSTY GITTNGDSLS LKFVTKGQYS TNVGSRTYLM DGEDKYQTFE LLGNEFTFDV DVSNIGCGLN GALYFVSMDA DGGLSRYPGN KAGAKYGTGY CDAQCPRDIK FINGEANIEG WTGSTNDPNA GAGRYGTCCS EMDIWEANNM ATAFTPHPCT IIGQSRCEGD SCGGTYSNER YAGVCDPDGC DFNSYRQGNK TFYGKGMTVD TTKKITVVTQ FLKDANGDLG EIKRFYVQDG KIIPNSESTI PGVEGNSITQ DWCDRQKVAF GDIDDFNRKG GMKQMGKALA GPMVLVMSIW DDHASNMLWL DSTFPVDAAG KPGAERGACP TTSGVPAEVE AEAPNSNVVF SNIRFGPIGS TVAGLPGAGN GGNNGGNPPP PTTTTSSAPA TTTTASAGPK AGRWQQCGGI GFTGPTQCEE PYTCTKLNDW YSQCL MQIKQYLQYL AAALPLVNMA AAQRAGTQQT ETHPRLSWKR CSSGGNCQTV NAEIVIDANW RWLHDSNYQN CYDGNRWTSA CSSATDCAQK CYLEGANYGS TYGVSTSGDA LTLKFVTKHE YGTNIGSRVY LMNGSDKYQM 
FTLMNNEFAF DVDLSKVECG LNSALYFVAM EEDGGMRSYS SNKAGAKYGT GYCDAQCARD LKFVGGKANI EGWRPSTNDA NAGVGPYGAC CAEIDVWESN AYAFAFTPHG CLNNNYHVCE TSNCGGTYSE DRFGGLCDAN GCDYNPYRMG NKDFYGKGKT VDTSRKFTVV TRFEENKLTQ FFIQDGRKID IPPPTWPGLP NSSAITPELC TNLSKVFDDR DRYEETGGFR TINEALRIPM VLVMSIWDGH YANMLWLDSV YPPEKAGQPG AERGPCAPTS GVPAEVEAQF PNAQVIWSNI RFGPIGSTYQ V MMYKKFAALA ALVAGAAAQQ ACSLTTETHP RLTWKRCTSG GNCSTVNGAV TIDANWRWTH TVSGSTNCYT GNEWDTSICS DGKSCAQTCC VDGADYSSTY GITTSGDSLN LKFVTKHQHG TNVGSRVYLM ENDTKYQMFE LLGNEFTFDV DVSNLGCGLN GALYFVSMDA DGGMSKYSGN KAGAKYGTGY CDAQCPRDLK FINGEANIEN WTPSTNDANA GFGRYGSCCS EMDIWDANNM ATAFTPHPCT IIGQSRCEGN SCGGTYSSER YAGVCDPDGC DFNAYRQGDK TFYGKGMTVD TTKKMTVVTQ FHKNSAGVLS EIKRFYVQDG KIIANAESKI PGNPGNSITQ EWCDAQKVAF GDIDDFNRKG GMAQMSKALE GPMVLVMSVW DDHYANMLWL DSTYPIDKAG TPGAERGACP TTSGVPAEIE AQVPNSNVIF SNIRFGPIGS TVPGLDGSTP SNPTATVAPP TSTTTSVRSS TTQISTPTSQ PGGCTTQKWG QCGGIGYTGC TNCVAGTTCT ELNPWYSQCL MYRNFLYAAS LLSVARSQLV GTQTTETHPG MTWQSCTAKG SCTTCSDNKA CASNCAVDGA DYKGTYGITA SGNSLQLKFI TKGSYSTNIG SRTYLMASDT AYQMFKFDGN KEFTFDVDLS GLPCGFNGAL YFVSMDEDGG LKKYSGNKAG AKYGTGYCDA QCPRDLKFIN GEGNVEGWKP SDNDANAGVG GHGSCCAEMD IWEANSISTA VTPHACSTIE QTRCDGDGCG GTYSADRYAG VCDPDGCDFN AYRMGVKNFY GKGMTVDTSK KFTVVTQFIG TGDAMEIKRF YVQGGKTIEQ PASTIPGVEG NSITTKFCDQ QKQVFGDRYT YKEKGGTANM AKALAQGMVL VMSLWDDHYS NMLWLDSTYP TDKNPDTDLG SGRGSCDVKS GAPADVESKS PDATVIYSNI KFGPLNSTY MLGKIAIASL SFLAIAKGQQ VGREVAENHP RLPWQRCTRN GGCQTVSNGQ VVLDANWRWL HVTDGYTNCY TGNSWNSSVC SDGTTCAQRC ALEGANYQQT YGITTSGNSL TMKFLTRSQG TNVGGRVYLM ENENRYQMFN LLNKEFTFDV DVSKVPCGIN GALYFIQMDA DGGMSSQPNN RAGAKYGTGY CDSQCPRDIK FIDGVANSVG WEPSETDSNA GRGRYGICCA EMDIWEANSI SNAYTPHPCR TQNDGGYQRC EGRDCNQPRY EGLCDPDGCD YNPFRMGNKD FYGPGKTIDT NRKMTVVTQF ITHDNTDTGT LVDIRRLYVQ DGRVIANPPT NFPGLMPAHD SITEQFCTDQ KNLFGDYSSF ARDGGLAHMG RSLAKGHVLA LSIWNDHGAH MLWLDSNYPT DADPNKPGIA RGTCPTTGGT PRETEQNHPD AQVIFSNIKF GDIGSTFSGY MFPRSILLAL SLTAVALGQQ VGTNMAENHP SLTWQRCTSS GCQNVNGKVT LDANWRWTHR INDFTNCYTG NEWDTSICPD GVTCAENCAL 
DGADYAGTYG VTSSGTALTL KFVTESQQKN IGSRLYLMAD DSNYEIFNLL NKEFTFDVDV SKLPCGLNGA LYFSEMAADG GMSSTNTAGA KYGTGYCDSQ CPRDIKFIDG EANSEGWEGS PNDVNAGTGN FGACCGEMDI WEANSISSAY TPHPCREPGL QRCEGNTCSV NDRYATECDP DGCDFNSFRM GDKSFYGPGM TVDTNQPITV VTQFITDNGS DNGNLQEIRR IYVQNGQVIQ NSNVNIPGID SGNSISAEFC DQAKEAFGDE RSFQDRGGLS GMGSALDRGM VLVLSIWDDH AVNMLWLDSD YPLDASPSQP GISRGTCSRD SGKPEDVEAN AGGVQVVYSN IKFGDINSTF NNNGGGGGNP SPTTTRPNSP AQTMWGQCGG QGWTGPTACQ SPSTCHVIND FYSQCF MYRNLALASL SLFGAARAQQ AGTVTTETHP SLSWKTCTGT GGTSCTTKAG KITLDANWRW THVTTGYTNC YDGNSWNTTA CPDGATCTKN CAVDGADYSG TYGITTSSNS LSIKFVTKGS NSANIGSRTY LMESDTKYQM FNLIGQEFTF DVDVSKLPCG LNGALYFVEM AADGGIGKGN NKAGAKYGTG YCDSQCPHDI KFINGKANVE GWNPSDADPN AGSGKIGACC PEMDIWEANS ISTAYTPHPC KGTGLQECTD DVSCGDGSNR YSGLCDKDGC DFNSYRMGVK DFYGPGATLD TTKKMTVVTQ FLGSGSTLSE IKRFYVQNGK VFKNSDSAIE GVTGNSITES FCAAQKTAFG DTNSFKTLGG LNEMGASLAR GHVLVMSLWD DHAVNMLWLD STYPTNSTKL GAQRGTCAID SGKPEDVEKN HPDATVVFSD IKFGPIGSTF QQPS MVDIQIATFL LLGVVGVAAQ QVGTYIPENH PLLATQSCTA SGGCTTSSSK IVLDANRRWI HSTLGTTSCL TANGWDPTLC PDGITCANYC ALDGVSYSST YGITTSGSAL RLQFVTGTNI GSRVFLMADD THYRTFQLLN QELAFDVDVS KLPCGLNGAL YFVAMDADGG KSKYPGNRAG AKYGTGYCDS QCPRDVQFIN GQANVQGWNA TSATTGTGSY GSCCTELDIW EANSNAAALT PHTCTNNAQT RCSGSNCTSN TGFCDADGCD FNSFRLGNTT FLGAGMSVDT TKTFTVVTQF ITSDNTSTGN LTEIRRFYVQ NGNVIPNSVV NVTGIGAVNS ITDPFCSQQK KAFIETNYFA QHGGLAQLGQ ALRTGMVLAF SISDDPANHM LWLDSNFPPS ANPAVPGVAR GMCSITSGNP ADVGILNPSP YVSFLNIKFG SIGTTFRPA MHQRALLFSA LAVAANAQQV GTQTPETHPP LTWQKCTAAG SCSQQSGSVV IDANWRWLHS TKDTTNCYTG NTWNTELCPD NESCAQNCAL DGADYAGTYG VTTSGSELKL SFVTGANVGS RLYLMQDDET YQHFNLLNHE FTFDVDVSNL PCGLNGALYF VAMDADGGMS KYPSNKAGAK YGTGYCDSQC PRDLKFINGM ANVEGWEPSS SDKNAGVGGH GSCCPEMDIW EANSISTAVT PHPCDDVSQT MCSGDACGGT YSESRYAGTC DPDGCDFNPF RMGNESFYGP GKIVDTKSKM TVVTQFITAD GTDSGALSEI KRLYVQNGKV IANSVSNVAG VSGNSITSDF CTAQKKAFGD EDIFAKHGGL SGMGKALSEM VLIMSIWDDH HSSMMWLDST YPTDADPSKP GVARGTCEHG AGDPENVESQ HPDASVTFSN IKFGPIGSTY EG MFRTATLLAF TMAAMVFGQQ VGTNTAENHR TLTSQKCTKS 
GGCSNLNTKI VLDANWRWLH STSGYTNCYT GNQWDATLCP DGKTCAANCA LDGADYTGTY GITASGSSLK LQFVTGSNVG SRVYLMADDT HYQMFQLLNQ EFTFDVDMSN LPCGLNGALY LSAMDADGGM AKYPTNKAGA KYGTGYCDSQ CPRDIKFING EANVEGWNAT SANAGTGNYG TCCTEMDIWE ANNDAAAYTP HPCTTNAQTR CSGSDCTRDT GLCDADGCDF NSFRMGDQTF LGKGLTVDTS KPFTVVTQFI TNDGTSAGTL TEIRRLYVQN GKVIQNSSVK IPGIDPVNSI TDNFCSQQKT AFGDTNYFAQ HGGLKQVGEA LRTGMVLALS IWDDYAANML WLDSNYPTNK DPSTPGVARG TCATTSGVPA QIEAQSPNAY VVFSNIKFGD LNTTYTGTVS SSSVSSSHSS TSTSSSHSSS STPPTQPTGV TVPQWGQCGG IGYTGSTTCA SPYTCHVLNP YYSQCY MYQRALLFSF FLAAARAHEA GTVTAENHPS LTWQQCSSGG SCTTQNGKVV IDANWRWVHT TSGYTNCYTG NTWDTSICPD DVTCAQNCAL DGADYSGTYG VTTSGNALRL NFVTQSSGKN IGSRLYLLQD DTTYQIFKLL GQEFTFDVDV SNLPCGLNGA LYFVAMDADG NLSKYPGNKA GAKYGTGYCD SQCPRDLKFI NGQANVEGWQ PSANDPNAGV GNHGSSCAEM DVWEANSIST AVTPHPCDTP GQTMCQGDDC GGTYSSTRYA GTCDPDGCDF NPYQPGNHSF YGPGKIVDTS SKFTVVTQFI TDDGTPSGTL TEIKRFYVQN GKVIPQSEST ISGVTGNSIT TEYCTAQKAA FGDNTGFFTH GGLQKISQAL AQGMVLVMSL WDDHAANMLW LDSTYPTDAD PDTPGVARGT CPTTSGVPAD VESQNPNSYV IYSNIKVGPI NSTFTAN MQIKSYIQYL AAALPLLSSV AAQQAGTITA ENHPRMTWKR CSGPGNCQTV QGEVVIDANW RWLHNNGQNC YEGNKWTSQC SSATDCAQRC ALDGANYQST YGASTSGDSL TLKFVTKHEY GTNIGSRFYL MANQNKYQMF TLMNNEFAFD VDLSKVECGI NSALYFVAME EDGGMASYPS NRAGAKYGTG YCDAQCARDL KFIGGKANIE GWRPSTNDPN AGVGPMGACC AEIDVWESNA YAYAFTPHAC GSKNRYHICE TNNCGGTYSD DRFAGYCDAN GCDYNPYRMG NKDFYGKGKT VDTNRKFTVV SRFERNRLSQ FFVQDGRKIE VPPPTWPGLP NSADITPELC DAQFRVFDDR NRFAETGGFD ALNEALTIPM VLVMSIWDDH HSNMLWLDSS YPPEKAGLPG GDRGPCPTTS GVPAEVEAQY PDAQVVWSNI RFGPIGSTVN V MYRKLAVISA FLATARAQSA CTLQSETHPP LTWQKCSSGG TCTQQTGSVV IDANWRWTHA TNSSTNCYDG NTWSSTLCPD NETCAKNCCL DGAAYASTYG VTTSGNSLSI GFVTQSAQKN VGARLYLMAS DTTYQEFTLL GNEFSFDVDV SQLPCGLNGA LYFVSMDADG GVSKYPTNTA GAKYGTGYCD SQCPRDLKFI NGQANVEGWE PSSNNANTGI GGHGSCCSEM DIWEANSISE ALTPHPCTTV GQEICEGDGC GGTYSDNRYG GTCDPDGCDW DPYRLGNTSF YGPGSSFTLD TTKKLTVVTQ FETSGAINRY YVQNGVTFQQ PNAELGSYSG NGLNDDYCTA EEAEFGGSSF SDKGGLTQFK KATSGGMVLV MSLWDDYYAN MLWLDSTYPT NETSSTPGAV RGSCSTSSGV PAQVESQSPN 
AKVTFSNIKF GPIGSTGDPS GGNPPGGNPP GTTTTRRPAT TTGSSPGPTQ SHYGQCGGIG YSGPTVCASG TTCQVLNPYY SQCL MYQRALLFSF FLAAARAQQA GTVTAENHPS LTWQQCSSGG SCTTQNGKVV IDANWRWVHT TSGYTNCYTG NTWDTSICPD DVTCAQNCAL DGADYSGTYG VTTSGNALRL NFVTQSSGKN IGSRLYLLQD DTTYQIFKLL GQEFTFDVDV SNLPCGLNGA LYFVAMDADG GLSKYPGNKA GAKYGTGYCD SQCPRDLKFI NGQANVEGWQ PSANDPNAGV GNHGSCCAEM DVWEANSIST AVTPHPCDTP GQTMCQGDDC GGTYSSTRYA GTCDPDGCDF NPYRQGNHSF YGPGQIVDTS SKFTVVTQFI TDDGTPSGTL TEIKRFYVQN GKVIPQSEST ISGVTGNSIT TEYCTAQKAA FGDNTGFFTH GGLQKISQAL AQGMVLVMSL WDDHAANMLW LDSTYPTDAD PDTPGVARGT CPTTSGVPAD VESQYPNSYV IYSNIKVGPI NSTFTAN MIRKITTLAA LVGVVRGQAA CSLTAETHPS LTWQKCSSGG SCTNVAGSVT IDANWRWTHT TSGYTNCYTG NKWDTSICST NADCASKCCV DGANYQQTYG ASTSGNALSL QYVTQSSGKN VGSRLYLLES ENKYQMFNLL GNEFTFDVDA SKLGCGLNGA VYFVSMDADG GQSKYSGNKA GAKYGTGYCD SQCPRDLKYI NGAANVEGWQ PSSGDANSGV GNMGSCCAEM DIWEANSIST AYTPHPCSNN AQHSCKGDDC GGTYSSVRYA GDCDPDGCDF NSYRQGNRTF YGPGSNFNVD SSKKVTVVTQ FISSGGQLTD IKRFYVQNGK VIPNSQSTIT GVTGNSVTQD YCDKQKTAFG DQNVFNQRGG LRQMGDALAK GMVLVMSVWD DHHSQMLWLD STYPTTSTAP GAARGSCSTS SGKPSDVQSQ TPGATVVYSN IKFGPIGSTF KSS MLRRALLLSS SAILAVKAQQ AGTATAENHP PLTWQECTAP GSCTTQNGAV VLDANWRWVH DVNGYTNCYT GNTWDPTYCP DDETCAQNCA LDGADYEGTY GVTSSGSSLK LNFVTGSNVG SRLYLLQDDS TYQIFKLLNR EFSFDVDVSN LPCGLNGALY FVAMDADGGV SKYPNNKAGA KYGTGYCDSQ CPRDLKFIDG EANVEGWQPS SNNANTGIGD HGSCCAEMDV WEANSISNAV TPHPCDTPGQ TMCSGDDCGG TYSNDRYAGT CDPDGCDFNP YRMGNTSFYG PGKIIDTTKP FTVVTQFLTD DGTDTGTLSE IKRFYIQNSN VIPQPNSDIS GVTGNSITTE FCTAQKQAFG DTDDFSQHGG LAKMGAAMQQ GMVLVMSLWD DYAAQMLWLD SDYPTDADPT TPGIARGTCP TDSGVPSDVE SQSPNSYVTY SNIKFGPINS TFTAS MHQRALLFSA FWTAVQAQQA GTLTAETHPS LTWQKCAAGG TCTEQKGSVV LDSNWRWLHS VDGSTNCYTG NTWDATLCPD NESCASNCAL DGADYEGTYG VTTSGDALTL QFVTGANIGS RLYLMADDDE SYQTFNLLNN EFTFDVDASK LPCGLNGAVY FVSMDADGGV AKYSTNKAGA KYGTGYCDSQ CPRDLKFING QVRKGWEPSD SDKNAGVGGH GSCCPQMDIW EANSISTAYT PHPCDDTAQT MCEGDTCGGT YSSERYAGTC DPDGCDFNAY RMGNESFYGP SKLVDSSSPV TVVTQFITAD GTDSGALSEI KRFYVQGGKV IANAASNVDG VTGNSITADF CTAQKKAFGD 
DDIFAQHGGL QGMGNALSSM VLTLSIWDDH HSSMMWLDSS YPEDADATAP GVARGTCEPH AGDPEKVESQ SGSATVTYSN IKYGPIGSTF DAPA MASTLSFKIY KNALLLAAFL GAAQAQQVGT STAEVHPSLT WQKCTAGGSC TSQSGKVVID SNWRWVHNTG GYTNCYTGND WDRTLCPDDV TCATNCALDG ADYKGTYGVT ASGSSLRLNF VTQASQKNIG SRLYLMADDS KYEMFQLLNQ EFTFDVDVSN LPCGLNGALY FVAMDEDGGM ARYPTNKAGA KYGTGYCDAQ CPRDLKFING QANVEGWEPS SSDVNGGTGN YGSCCAEMDI WEANSISTAF TPHPCDDPAQ TRCTGDSCGG TYSSDRYGGT CDPDGCDFNP YRMGNQSFYG PSKIVDTESP FTVVTQFITN DGTSTGTLSE IKRFYVQNGK VIPQSVSTIS AVTGNSITDS FCSAQKTAFK DTDVFAKHGG MAGMGAGLAE GMVLVMSLWD DHAANMLWLD STYPTSASST TPGAARGSCD ISSGEPSDVE ANHSNAYVVY SNIKVGPLGS TFGSTDSGSG TTTTKVTTTT ATKTTTTTGP STTGAAHYAQ CGGQNWTGPT TCASPYTCQR QGDYYSQCL MVSAKFAALA ALVASASAQQ VCSLTPESHP PLTWQRCSAG GSCTNVAGSV TLDSNWRWTH TLQGSTNCYS GNEWDTSICT TGTKCAQNCC VEGAEYAATY GITTSGNQLN LKFVTEGKYS TNVGSRTYLM ENATKYQGFN LLGNEFTFDV DVSNIGCGLN GALYFVSMDL DGGLAKYSGN KAGAKYGTGY CDAQCPRDIK FINGEANIEG WNPSTNDVNA GAGRYGTCCS EMDIWEANNM ATAYTPHSCT ILDQSRCEGE SCGGTYSSDR YGGVCDPDGC DFNSYRMGNK EFYGKGKTVD TTKKMTVVTQ FLKNAAGELS EIKRFYVQNG VVIPNSVSSI PGVPNQNSIT QDWCDAQKIA FGDPDDNTAK GGLRQMGLAL DKPMVLVMSI WNDHAAHMLW LDSTYPVDAA GRPGAERGAC PTTSGVPSEV EAEAPNSNVA FSNIKFGPIG STFNSGSTNP NPISSSTATT PTSTRVSSTS TAAQTPTSAP GGTVPRWGQC GGQGYTGPTQ CVAPYTCVVS NQWYSQCL MFPYIALVSF SFLSVVLAQQ VGTLTAETHP QLTVQQCTRG GSCTTQQRSV VLDGNWRWLH STSGSNNCYT GNTWDTSLCP DAATCSRNCA LDGADYSGTY GITSSGNALT LKFVTHGPYS TNIGSRVYLL ADDSHYQMFN LKNKEFTFDV DVSQLPCGLN GALYFSQMDA DGGTGRFPNN KAGAKYGTGY CDSQCPHDIK FINGEANVQG WQPSPNDSNA GKGQYGSCCA EMDIWEANSM ASAYTPHPCT VTTPTRCQGN DCGDGDNRYG GVCDKDGCDF NSFRMGDKNF LGPGKTVNTN SKFTVVTQFL TSDNTTSGTL SEIRRLYVQN GRVIQNSKVN IPGMASTLDS ITESFCSTQK TVFGDTNSFA SKGGLRAMGN AFDKGMVLVL SIWDDHEAKM LWLDSNYPLD KSASAPGVAR GTCATTSGEP KDVESQSPNA QVIFSNIKYG DIGSTYSN MYRAIATASA LIAAVRAQQV CSLTQESKPS LNWSKCTSSG CSNVKGSVTI DANWRWTHQV SGSTNCYTGN KWDTSVCTSG KVCAERCCLD GADYASTYGI TSSGDQLSLS FVTKGPYSTN IGSRTYLMED ENTYQMFQLL GNEFTFDVDV SNIGCGLNGA LYFVSMDADG GKAKYPGNKA GAKYGTGYCD AQCPRDVKFI 
NGQANSDGWQ PSDSDVNGGI GNLGTCCPEM DIWEANSIST AYTPHPCTKL TQHSCTGDSC GGTYSNDRYG GTCDADGCDF NSYRQGNKTF YGPGSGFNVD TTKKVTVVTQ FHKGSNGRLS EITRLYVQNG KVIANSESKI AGVPGNSLTA DFCTKQKKVF NDPDDFTKKG AWSGMSDALE APMVLVMSLW HDHHSNMLWL DSTYPTDSTK LGSQRGSCST SSGVPADLEK NVPNSKVAFS NIKFGPIGST YKSDGTTPTN PTNPSEPSNT ANPNPGTVDQ WGQCGGSNYS GPTACKSGFT CKKINDFYSQ CQ MYSAAVLATF SFLLGAGAQQ VGTLKTESHP PLTIQKCAAG GTCTDEADSV VLDANWRWLH STSGSTNCYT GNTWDTTLCP DAATCTANCA FDGADYEGTY GITSSGDSLK LSFVTGSNVG SRTYLMDSET TYKEFALLGN EFTFTVDVSK LPCGLNGALY FVPMDADGGM SKYPTNKAGA KYGTGYCDAQ CPQDMKFVSG GANNEGWVPD SNSANSGTGN IGSCCSEFDV WEANSMSQAL TPHTCTVDGQ TACTGDDCAG NTGVCDADGC DFNPYRMGNT TFYGSGKTID TTKPFSVVTQ FITDDGTETG TLTEIKRFYV QDDVVYEQPN SDISGVSGNS ITDDFCTAQK TAFGDTDYFS QKGGMAAMGK KMADGMVLVL SIWDDYNVNM LWLDSDYPTT KDASTPGVSR GSCATTSGVP ATVEAASGSA YVTFSSIKYG PIGSTFKAPA DSSSPVVASS SPAAVAAVVS TSSAQAVPSH PAVSSSQAAV STPEAVSSAP EVPASSSAAQ SVAPTSTKPK CSKVSQSSTL ATSVAAPATT ATSAAVAATS AASSSGSVPL YGNCTGGKTC SEGTCVVQNP WYSQCVASS MFRAAALLAF TCLAMVSGQQ AGTNTAENHP QLQSQQCTTS GGCKPLSTKV VLDSNWRWVH STSGYTNCYT GNEWDTSLCP DGKTCAANCA LDGADYSGTY GITSTGTALT LKFVTGSNVG SRVYLMADDT HYQLLKLLNQ EFTFDVDMSN LPCGLNGALY LSAMDADGGM SKYPGNKAGA KYGTGYCDSQ CPKDIKFING EANVGNWTET GSNTGTGSYG TCCSEMDIWE ANNDAAAFTP HPCTTTGQTR CSGDDCARNT GLCDGDGCDF NSFRMGDKTF LGKGMTVDTS KPFTVVTQFL TNDNTSTGTL SEIRRIYIQN GKVIQNSVAN IPGVDPVNSI TDNFCAQQKT AFGDTNWFAQ KGGLKQMGEA LGNGMVLALS IWDDHAANML WLDSDYPTDK DPSAPGVARG TCATTSGVPS DVESQVPNSQ VVFSNIKFGD IGSTFSGTSS PNPPGGSTTS SPVTTSPTPP PTGPTVPQWG QCGGIGYSGS TTCASPYTCH VLNPYYSQCY MYRKLAVISA FLATARAQSA CTLQSETHPP LTWQKCSSGG TCTQQTGSVV IDANWRWTHA TNSSTNCYDG NTWSSTLCPD NETCAKNCCL DGAAYASTYG VTTSGNSLSI GFVTQSAQKN VGARLYLMAS DTTYQEFTLL GNEFSFDVDV SQLPCGLNGA LYFVSMDADG GVSKYPTNTA GAKYGTGYCD SQCPRDLKFI NGQANVEGWE PSSNNANTGI GGHGSCCSEM DIWEANSISE ALTPHPCTTV GQEICEGDGC GGTYSDNRYG GTCDPDGCDW NPYRLGNTSF YGPGSSFTLD TTKKLTVVTQ FETSGAINRY YVQNGVTFQQ PNAELGSYSG NELNDDYCTA EEAEFGGSSF SDKGGLTQFK KATSGGMVLV MSLWDDYYAN MLWLDSTYPT 
NETSSTPGAV RGSCSTSSGV PAQVESQSPN AKVTFSNIKF GPIGSTGNPS GGNPPGGNRG TTTTRRPATT TGSSPGPTQS HYGQCGGIGY SGPTVCASGT TCQVLNPYYS QCL MPSTYDIYKK LLLLASFLSA SQAQQVGTSK AEVHPSLTWQ TCTSGGSCTT VNGKVVVDAN WRWVHNVDGY NNCYTGNTWD TTLCPDDETC ASNCALEGAD YSGTYGVTTS GNSLRLNFVT QASQKNIGSR LYLMEDDSTY KMFKLLNQEF TFDVDVSNLP CGLNGAVYFV SMDADGGMAK YPANKAGAKY GTGYCDSQCP RDLKFINGMA NVEGWEPSAN DANAGTGNHG SCCAEMDIWE ANSISTAYTP HPCDTPGQVM CTGDSCGGTY SSDRYGGTCD PDGCDFNSYR QGNKTFYGPG MTVDTKSKIT VVTQFLTNDG TASGTLSEIK RFYVQNGKVI PNSESTWSGV SGNSITTAYC NAQKTLFGDT DVFTKHGGME GMGAALAEGM VLVLSLWDDH NSNMLWLDSN YPTDKPSTTP GVARGSCDIS SGDPKDVEAN DANAYVVYSN IKVGPIGSTF SGSTGGGSSS STTATSKTTT TSATKTTTTT TKTTTTTSAS STSTGGAQHW AQCGGIGWTG PTTCVAPYTC QKQNDYYSQC L MISKVLAFTS LLAAARAQQA GTLTTETHPP LSVSQCTASG CTTSAQSIVV DANWRWLHST TGSTNCYTGN TWDKTLCPDG ATCAANCALD GADYSGVYGI TTSGNSIKLN FVTKGANTNV GSRTYLMAAG STTQYQMLKL LNQEFTFDVD VSNLPCGLNG ALYFAAMDAD GGLSRFPTNK AGAKYGTGYC DAQCPQDIKF INGVANSVGW TPSSNDVNAG AGQYGSCCSE MDIWEANKIS AAYTPHPCSV DTQTRCTGTD CGIGARYSSL CDADGCDFNS YRQGNTSFYG AGLTVNTNKV FTVVTQFITN DGTASGTLKE IRRFYVQNGV VIPNSQSTIA GVPGNSITDS FCAAQKTAFG DTNEFATKGG LATMSKALAK GMVLVMSIWD DHTANMLWLD APYPATKSPS APGVTRGSCS ATSGNPVDVE ANSPGSSVTF SNIKWGPINS TYTGSGAAPS VPGTTTVSSA PASTATSGAG GVAKYAQCGG SGYSGATACV SGSTCVALNP YYSQCQ MFPAATLFAF SLFAAVYGQQ VGTQLAETHP RLTWQKCTRS GGCQTQSNGA IVLDANWRWV HNVGGYTNCY TGNTWNTSLC PDGATCAKNC ALDGANYQST YGITTSGNAL TLKFVTQSEQ KNIGSRVYLL ESDTKYQLFN PLNQEFTFDV DVSQLPCGLN GAVYFSAMDA DGGMSKFPNN AAGAKYGTGY CDSQCPRDIK FINGEANVQG WQPSPNDTNA GTGNYGACCN EMDVWEANSI STAYTPHPCT QQGLVRCSGT ACGGGSNRYG SICDPDGCDF NSFRMGDKSF YGPGLTVNTQ QKFTVVTQFL TNNNSSSGTL REIRRLYVQN GRVIQNSKVN IPGMPSTMDS VTTEFCNAQK TAFNDTFSFQ QKGGMANMSE ALRRGMVLVL SIWDDHAANM LWLDSNYPTD RPASQPGVAR GTCPTSSGKP SDVENSTANS QVIYSNIKFG DIGSTYSA MKGSISYQIY KGALLLSALL NSVSAQQVGT LTAETHPALT WSKCTAGXCS QVSGSVVIDA NWPXVHSTSG STNCYTGNTW DATLCPDDVT CAANCAVDGA RRQHLRVTTS GNSLRINFVT TASQKNIGSR LYLLENDTTY QKFNLLNQEF TFDVDVSNLP CGLNGALYFV DMDADGGMAK 
YPTNKAGAKY GTGYCDSQCP RDLKFINGQA NVDGWTPSKN DVNSGIGNHG SCCAEMDIWE ANSISNAVTP HPCDTPSQTM CTGQRCGGTY STDRYGGTCD PDGCDFNPYR MGVTNFYGPG ETIDTKSPFT VVTQFLTNDG TSTGTLSEIK RFYVQGGKVI GNPQSTIVGV SGNSITDSWC NAQKSAFGDT NEFSKHGGMA GMGAGLADGM VLVMSLWDDH ASDMLWLDST YPTNATSTTP GAKRGTCDIS RRPNTVESTY PNAYVIYSNI KTGPLNSTFT GGTTSSSSTT TTTSKSTSTS SSSKTTTTVT TTTTSSGSSG TGARDWAQCG GNGWTGPTTC VSPYTCTKQN DWYSQCL MFRTAALTAF TLAAVVLGQQ VGTLTAENHP ALSIQQCTAS GCTTQQKSVV LDSNWRWTHS LPVHTNCYTG NAWDASLCPD PTTCATNCAI DGADYSGTYG ITTSGNALTL RFVTNGPYSK NIGSRVYLLD DADHYKMFDL KNQEFTFDVD MSGLPCGLNG ALYFSEMPAD GGKAAHTSNK AGAKYGTGYC DAQCPHDIKW INGEANILDW SASATDANAG NGRYGACCAE MDIWEANSEA TAYTPHVCRD EGLYRCSGTE CGDGDNRYGG VCDKDGCDFN SYRMGDKNFL GRGKTIDTTK KITVVTQFIT DDNTSSGNLV EIRRVYVQDG VTYQNSFSTF PSLSQYNSIS DDFCVAQKTL FGDNQYYNTH GGTEKMGDAM ANGMVLIMSL WSDHAAHMLW LDSDYPLDKS PSEPGVSRGA CATTTGDPDD VVANHPNASV TFSNIKYGPI GSTYGGSTPP VSSGNTSAPP VTSTTSSGPT TPTGPTGTVP KWGQCGGNGY SGPTTCVAGS TCTYSNDWYS QCL MYQRALLFSA LLSVSRAQQA GTAQEEVHPS LTWQRCEASG SCTEVAGSVV LDSNWRWTHS VDGYTNCYTG NEWDATLCPD NESCAQNCAV DGADYEATYG ITSNGDSLTL KFVTGSNVGS RVYLMEDDET YQMFDLLNNE FTFDVDVSNL PCGLNGALYF TSMDADGGLS KYEGNTAGAK YGTGYCDSQC PRDIKFINGL GNVEGWEPSD SDANAGVGGM GTCCPEMDIW EANSISTAYT PHPCDSVEQT MCEGDSCGGT YSDDRYGGTC DPDGCDFNSY RMGNTSFYGP GAIIDTSSKF TVVTQFIADG GSLSEIKRFY VQNGEVIPNS ESNISGVEGN SITSEFCTAQ KTAFGDEDIF AQHGGLSAMG DAASAMVLIL SIWDDHHSSM MWLDSSYPTD ADPSQPGVAR GTCEQGAGDP DVVESEHADA SVTFSNIKFG PIGSTF MYRAIATASA LIAAVRAQQV CSLTTETKPA LTWSKCTSSG CSNVQGSVTI DANWRWTHQV SGSTNCHTGN KWDTSVCTSG KVCAEKCCVD GADYASTYGI TSSGNQLSLS FVTKGSYGTN IGSRTYLMED ENTYQMFQLL GNEFTFDVDV SNIGCGLNGA LYFVSMDADG GKAKYPGNKA GAKYGTGYCD AQCPRDVKFI NGQANSDGWE PSKSDVNGGI GNLGTCCPEM DIWEANSIST AYTPHPCTKL TQHACTGDSC GGTYSNDRYG GTCDADGCDF NAYRQGNKTF YGPGSGFNVD TTKKVTVVTQ FHKGSNGRLS EITRLYVQNG KVIANSESKI AGNPGSSLTS DFCTTQKKVF GDIDDFAKKG AWNGMSDALE APMVLVMSLW HDHHSNMLWL DSTYPTDSTA LGSQRGSCST SSGVPADLEK NVPNSKVAFS NIKFGPIGST YNKEGTQPQP TNPTNPNPTN PTNPGTVDQW GQCGGTNYSG 
PTACKSPFTC KKINDFYSQC Q MFRTAALTAF TLAAVVLGQQ VGTLAAENHP ALSIQQCTAS GCTTQQKSVV LDSNWRWTHS TAGATNCYTG NAWDSSLCPN PTTCATNCAI DGADYSGTYG ITTSGNSLTL RFVTNGQYSE NIGSRVYLLD DADHYKLFNL KNQEFTFDVD MSGLPCGLNG ALYFSEMAAD GGKAAHTGNN AGAKYGTGYC DAQCPHDIKW INGEANILDW SGSATDPNAG NGRYGACCAE MDIWEANSEA TAYTPHVCRD EGLYRCSGTE CGDGDNRYGG VCDKDGCDFN SYRMGDKNFL GRGKTIDTTK KITVVTQFIT DDNTPTGNLV EIRRVYVQDG VTYQNSFSTF PSLSQYNSIS DDFCVAQKTL FGDNQYYNTH GGTEKMGDSL ANGMVLIMSL WSDHAAHMLW LDSDYPLDKS PSEPGVSRGA CATTTGDPDD VVANHPNASV TFSNIKYGPI GSTYGGSTPP VSSGNTSVPP VTSTTSSGPT TPTGPTGTVP KWGQCGGIGY SGPTSCVAGS TCTYSNEWYS QCL MYQKLALISA FLATARAQSA CTLQAETHPP LTWQKCSSGG TCTQQTGSVV IDANWRWTHA TNSSTNCYDG NTWSSTLCPD NETCAKNCCL DGAAYASTYG VTTSADSLSI GFVTQSAQKN VGARLYLMAS DTTYQEFTLL GNEFSFDVDV SQLPCGLNGA LYFVSMDADG GVTKYPTNTA GAKYGTGYCD SQCPRDLKFI NGQANVEGWE PSSNNANTGI GGHGSCCSEM DIWEANSISE ALTPHPCTTV GQEICEGDSC GGTYSGDRYG GTCDPDGCDW NPYRLGNTSF YGPGSSFTLD TTKKLTVVTQ FETSGAINRY YVQNGVTFQQ PNAELGDYSG NSLDDDYCAA EEAEFGGSSF SDKGGLTQFK KATSGGMVLV MSLWDDYYAN MLWLDSTYPT DETSSTPGAV RGSSSTSSGV PAQLESNSPN AKVVYSNIKF GPIGSTGNPS GGNPPGGNPP GTTTPRPATS TGSSPGPTQT HYGQCGGIGY IGPTVCASGS TCQVLNPYYS QCL MTWQSCTAKG SCTNKNGKIV IDANWRWLHK KEGYDNCYTG NEWDATACPD NKACAANCAV DGADYSGTYG ITAGSNSLKL KFITKGSYST NIGSRTYLMK DDTTYEMFKF TGNQEFTFDV DVSNLPCGFN GALYFVSMDA DGGLKKYSTN KAGAKYGTGY CDAQCPRDLK FINGEGNVEG WKPSSNDANA GVGGHGSCCA EMDIWEANSV STAVTPHSCS TIEQSRCDGD GCGGTYSADR YAGVCDPDGC DFNSYRMGVK DFYGKGKTVD TSKKFTVVTQ FIGTGDAMEI KRFYVQNGKT IAQPASAVPG VEGNSITTKF CDQQKAVFGD TYTFKDKGGM ANMAKALANG MVLVMSLWDD HYSNMLWLDS TYPTDKNPDT DLGTGRGECE TSSGVPADVE SQHADATVVY SNIKFGPLNS TFG MASAISFQVY RSALILSAFL PSITQAQQIG TYTTETHPSM TWETCTSGGS CATNQGSVVM DANWRWVHQV GSTTNCYTGN TWDTSICDTD ETCATECAVD GADYESTYGV TTSGSQIRLN FVTQNSNGAN VGSRLYMMAD NTHYQMFKLL NQEFTFDVDV SNLPCGLNGA LYFVTMDEDG GVSKYPNNKA GAQYGVGYCD SQCPRDLKFI QGQANVEGWT PSSNNENTGL GNYGSCCAEL DIWESNSISQ ALTPHPCDTA TNTMCTGDAC GGTYSSDRYA GTCDPDGCDF NPYRMGNTTF YGPGKTIDTN SPFTVVTQFI TDDGTDTGTL 
SEIRRYYVQN GVTYAQPDSD ISGITGNAIN ADYCTAENTV FDGPGTFAKH GGFSAMSEAM STGMVLVMSL WDDYYADMLW LDSTYPTNAS SSTPGAVRGS CSTDSGVPAT IESESPDSYV TYSNIKVGPI GSTFSSGSGS GSSGSGSSGS ASTSTTSTKT TAATSTSTAV AQHYSQCGGQ DWTGPTTCVS PYTCQVQNAY YSQCL MKAYFEYLVA ALPLLGLATA QQVGKQTTET HPKLSWKKCT GKANCNTVNA EVVIDSNWRW LHDSSGKNCY DGNKWTSACS SATDCASKCQ LDGANYGTTY GASTSGDALT LKFVTKHEYG TNIGSRFYLM NGASKYQMFT LMNNEFAFDV DLSTVECGLN AALYFVAMEE DGGMASYSSN KAGAKYGTGY CDAQCARDLK FVGGKANIEG WTPSTNDANA GVGPYGGCCA EIDVWESNAH SFAFTPHACK TNKYHVCERD NCGGTYSEDR FAGLCDANGC DYNPYRMGNT DFYGKGKTVD TSKKFTVVSR FEENKLTQFF VQNGQKIEIP GPKWDGIPSD NANITPEFCS AQFQAFGDRD RFAEVGGFAQ LNSALRMPMV LVMSIWDDHY ANMLWLDSVY PPEKEGQPGA ARGDCPQSSG VPAEVESQYA NSKVVYSNIR FGPVGSTVNV MFSKFALTGS LLAGAVNAQG VGTQQTETHP QMTWQSCTSP SSCTTNQGEV VIDSNWRWVH DKDGYVNCYT GNTWNTTLCP DDKTCAANCV LDGADYSSTY GITTSGNALS LQFVTQSSGK NIGSRTYLME SSTKYHLFDL IGNEFAFDVD LSKLPCGLNG ALYFVTMDAD GGMAKYSTNT AGAEYGTGYC DSQCPRDLKF INGQGNVEGW TPSTNDANAG VGGLGSCCSE MDVWEANSMD MAYTPHPCET AAQHSCNADE CGGTYSSSRY AGDCDPDGCD WNPFRMGNKD FYGSGDTVDT SQKFTVVTQF HGSGSSLTEI SQYYIQGGTK IQQPNSTWPT LTGYNSITDD FCKAQKVEFN DTDVFSEKGG LAQMGAGMAD GMVLVMSLWD DHYANMLWLD STYPVDADAS SPGKQRGTCA TTSGVPADVE SSDASATVIY SNIKFGPIGA TY MFPAAALLSF TLLAVASAQQ IGTNTAEVHP SLTVSQCTTS GGCTSSTQSI VLDANWRWLH STSGYTNCYT GNQWNSDLCP DPDTCATNCA LDGASYESTY GISTDGNAVT LNFVTQGSQT NVGSRVYLLS DDTHYQTFSL LNKEFSFDVD ASNIGCGING AVYFVQMDAD GGLSKYSSNK AGAQYGTGYC DSQCPQDIKF INGEANLLDW NATSANSGTG SYGSCCPEMD IWEANKYAAA YTPHPCSVSG QTRCTGTSCG AGSERYDGYC DKDGCDFNSW RMGNETFLGP GMTIDTNKKF TIVTQFITDD NTANGTLSEI RRLYVQGGTV IQNSVANQPN IPKVNSITDS FCTAQKTEFG DQDYFGTIGG LSQMGKAMSD MVLVMSIWDD YDAEMLWLDS NYPTSGSAST PGISRGPCSA TSGLPATVES QQASASVTYS NIKWGDIGST YSGSGSSGSS SSSSSSAASA STSTHTSAAA TATSSAAAAT GSPVPAYGQC GGQSYTGSTT CASPYVCKVS NAYYSQCLPA MKRALCASLS LLAAAVAQQV GTNEPEVHPK MTWKKCSSGG SCSTVNGEVV IDGNWRWIHN IGGYENCYSG NKWTSVCSTN ADCATKCAME GAKYQETYGV STSGDALTLK FVQQNSSGKN VGSRMYLMNG ANKYQMFTLK NNEFAFDVDL SSVECGMNSA LYFVPMKEDG GMSTEPNNKA 
GAKYGTGYCD AQCARDLKFI GGKGNIEGWQ PSSTDSSAGI GAQGACCAEI DIWESNKNAF AFTPHPCENN EYHVCTEPNC GGTYADDRYG GGCDANGCDY NPYRMGNPDF YGPGKTIDTN RKFTVISRFE NNRNYQILMQ DGVAHRIPGP KFDGLEGETG ELNEQFCTDQ FTVFDERNRF NEVGGWSKLN AAYEIPMVLV MSIWSDHFAN MLWLDSTYPP EKAGQPGSAR GPCPADGGDP NGVVNQYPNA KVIWSNVRFG PIGSTYQVD MQLTKAGVFL GALMGGAAAQ QVGTQTAENH PKMTWKKCTG KASCTTVNGE VVIDANWRWL HDASSKNCYD GNRWTDSCRT ASDCAAKCSL EGADYAKTYG ASTSGDALSL KFVTRHDYGT NIGSRFYLMN GASKYQMFSL LGNEFAFDVD LSTIECGLNS ALYFVAMEED GGMKSYSSNK AGAKYGTGYC DAQCARDLKF VGGKANIEGW KPSSNDANAG VGPYGACCAE IDVWESNAHA FAFTPHPCTD NKYHVCQDSN CGGTYSDDRF AGKCDANGCD INPYRLGNTD FYGKGKTVDT SKKFTVVTRF ERDALTQFFV QNNKRIDMPS PALEGLPATG AITAEYCTNV FNVFGDRNRF DEVGGWSQLQ QALSLPMVLV MSIWDDHYSN MLWLDSVYPP DKEGSPGAAR GDCPQDSGVP SEVESQIPGA TVVWSNIRFG PVGSTVNV MYRIVATASA LIAAARAQQV CSLNTETKPA LTWSKCTSSG CSDVKGSVVI DANWRWTHQT SGSTNCYTGN KWDTSICTDG KTCAEKCCLD GADYSGTYGI TSSGNQLSLG FVTNGPYSKN IGSRTYLMEN ENTYQMFQLL GNEFTFDVDV SGIGCGLNGA PHFVSMDEDG GKAKYSGNKA GAKYGTGYCD AQCPRDVKFI NGVANSEGWK PSDSDVNAGV GNLGTCCPEM DIWEANSIST AFTPHPCTKL TQHSCTGDSC GGTYSSDRYG GTCDADGCDF NAYRQGNKTF YGPGSNFNID TTKKMTVVTQ FHKGSNGRLS EITRLYVQNG KVIANSESKI AGNPGSSLTS DFCSKQKSVF GDIDDFSKKG GWNGMSDALS APMVLVMSLW HDHHSNMLWL DSTYPTDSTK VGSQRGSCAT TSGKPSDLER DVPNSKVSFS NIKFGPIGST YKSDGTTPNP PASSSTTGSS TPTNPPAGSV DQWGQCGGQN YSGPTTCKSP FTCKKINDFY SQCQ MYQRALLFSA LATAVSAQQV GTQKAEVHPA LTWQKCTAAG SCTDQKGSVV IDANWRWLHS TEDTTNCYTG NEWNAELCPD NEACAKNCAL DGADYSGTYG VTADGSSLKL NFVTSANVGS RLYLMEDDET YQMFNLLNNE FTFDVDVSNL PCGLNGALYF VSMDADGGLS KYPGNKAGAK YGTGYCDSQC PRDLKFINGE ANVEGWKPSD NDKNAGVGGY GSCCPEMDIW EANSISTAYT PHPCDGMEQT RCDGNDCGGT YSSTRYAGTC DPDGCDFNSF RMGNESFYGP GGLVDTKSPI TVVTQFVTAG GTDSGALKEI RRVYVQGGKV IGNSASNVAG VEGDSITSDF CTAQKKAFGD EDIFSKHGGL EGMGKALNKM ALIVSIWDDH ASSMMWLDST YPVDADASTP GVARGTCEHG LGDPETVESQ HPDASVTFSN IKFGPIGSTY KSV MSALNSFNMY KSALILGSLL ATAGAQQIGT YTAETHPSLS WSTCKSGGSC TTNSGAITLD ANWRWVHGVN TSTNCYTGNT WNTAICDTDA SCAQDCALDG ADYSGTYGIT TSGNSLRLNF VTGSNVGSRT 
YLMADNTHYQ IFDLLNQEFT FTVDVSNLPC GLNGALYFVT MDADGGVSKY PNNKAGAQYG VGYCDSQCPR DLKFIAGQAN VEGWTPSTNN SNTGIGNHGS CCAELDIWEA NSISEALTPH PCDTPGLTVC TADDCGGTYS SNRYAGTCDP DGCDFNPYRL GVTDFYGSGK TVDTTKPFTV VTQFVTDDGT SSGSLSEIRR YYVQNGVVIP QPSSKISGIS GNVINSDFCA AELSAFGETA SFTNHGGLKN MGSALEAGMV LVMSLWDDYS VNMLWLDSTY PANETGTPGA ARGSCPTTSG NPKTVESQSG SSYVVFSDIK VGPFNSTFSG GTSTGGSTTT TASGTTSTKA STTSTSSTST GTGVAAHWGQ CGGQGWTGPT TCASGTTCTV VNPYYSQCL MRTAKFATLA ALVASAAAQQ ACSLTTERHP SLSWNKCTAG GQCQTVQASI TLDSNWRWTH QVSGSTNCYT GNKWDTSICT DAKSCAQNCC VDGADYTSTY GITTNGDSLS LKFVTKGQHS TNVGSRTYLM DGEDKYQTFE LLGNEFTFDV DVSNIGCGLN GALYFVSMDA DGGLSRYPGN KAGAKYGTGY CDAQCPRDIK FINGEANIEG WTGSTNDPNA GAGRYGTCCS EMDIWEANNM ATAFTPHPCT IIGQSRCEGD SCGGTYSNER YAGVCDPDGC DFNSYRQGNK TFYGKGMTVD TTKKITVVTQ FLKDANGDLG EIKRFYVQDG KIIPNSESTI PGVEGNSITQ DWCDRQKVAF GDIDDFNRKG GMKQMGKALA GPMVLVMSIW DDHASNMLWL DSTFPVDAAG KPGAERGACP TTSGVPAEVE AEAPNSNVVF SNIRFGPIGS TVAGLPGAGN GGNNGGNPPP PTTTTSSAPA TTTTASAGPK AGRWQQCGGI GFTGPTQCEE PYICTKLNDW YSQCL MMYKKFAALA ALVAGASAQQ ACSLTAENHP SLTWKRCTSG GSCSTVNGAV TIDANWRWTH TVSGSTNCYT GNQWDTSLCT DGKSCAQTCC VDGADYSSTY GITTSGDSLN LKFVTKHQYG TNVGSRVYLM ENDTKYQMFE LLGNEFTFDV DVSNLGCGLN GALYFVSMDA DGGMSKYSGN KAGAKYGTGY CDAQCPRDLK FINGEANVGN WTPSTNDANA GFGRYGSCCS EMDVWEANNM ATAFTPHPCT TVGQSRCEAD TCGGTYSSDR YAGVCDPDGC DFNAYRQGDK TFYGKGMTVD TNKKMTVVTQ FHKNSAGVLS EIKRFYVQDG KIIANAESKI PGNPGNSITQ EYCDAQKVAF SNTDDFNRKG GMAQMSKALA GPMVLVMSVW DDHYANMLWL DSTYPIDQAG APGAERGACP TTSGVPAEIE AQVPNSNVIF SNIRFGPIGS TVPGLDGSNP GNPTTTVVPP ASTSTSRPTS STSSPVSTPT GQPGGCTTQK WGQCGGIGYT GCTNCVAGTT CTQLNPWYSQ CL MASLSLSKIC RNALILSSVL STAQGQQVGT YQTETHPSMT WQTCGNGGSC STNQGSVVLD ANWRWVHQTG SSSNCYTGNK WDTSYCSTND ACAQKCALDG ADYSNTYGIT TSGSEVRLNF VTSNSNGKNV GSRVYMMADD THYEVYKLLN QEFTFDVDVS KLPCGLNGAL YFVVMDADGG VSKYPNNKAG AKYGTGYCDS QCPRDLKFIQ GQANVEGWVS STNNANTGTG NHGSCCAELD IWESNSISQA LTPHPCDTPT NTLCTGDACG GTYSSDRYSG TCDPDGCDFN PYRVGNTTFY GPGKTIDTNK PITVVTQFIT DDGTSSGTLS EIKRFYVQDG VTYPQPSADV SGLSGNTINS 
EYCTAENTLF EGSGSFAKHG GLAGMGEAMS TGMVLVMSLW DDYYANMLWL DSNYPTNEST SKPGVARGTC STSSGVPSEV EASNPSAYVA YSNIKVGPIG STFKS MYRAIATASA LIAAVRAQQV CSLTPETKPA LSWSKCTSSG CSNVQGSVTI DANWRWTHQL SGSTNCYTGN KWDTSICTSG KVCAEKCCID GAEYASTYGI TSSGNQLSLS FVTKGAYGTN IGSRTYLMED ENTYQMFQLL GNEFTFDVDV SNIGCGLNGA LYFVSMDADG GKAKYPGNKA GAKYGTGYCD AQCPRDVKFI NGQANSDGWQ PSKSDVNAGI GNMGTCCPEM DIWEANSIST AYTPHPCTKL TQHSCTGDSC GGTYSNDRYG GTCDADGCDF NAYRQGNKTF YGPGSGFNVD TTKKVTVVTQ FHKGSNGRLS EITRLYVQNG KVIANSESKI AGVPGSSLTP EFCTAQKKVF GDTDDFAKKG AWSGMSDALE APMVLVMSLW HDHHSNMLWL DSTYPTDSTK LGAQRGSCST SSGVPADLEK NVPNSKVAFS NIKFGPIGST YKEGVPEPTN PTNPTNPTNP TNPGTVDQWA QCGGTNYSGP TACKSPFTCK KINDFYSQCQ MFPKSSLLVL SFLATAYAQQ VGTQTAEVHP SLNWARCTSS GCTNVAGSVT LDANWRWLHT TSGYTNCYTG NSWNTTLCPD GATCAQNCAL DGANYQSTCG ITTSGNALTL KFVTQGEQKN IGSRVYLMAS ESRYEMFGLL NKEFTFDVDV SNLPCGLNGA LYFSSMDADG GMAKNPGNKA GAKYGTGYCD SQCPRDIKFI NGEANVAGWN GSPNDTNAGT GNWGACCNEM DIWEANSISA AYTPHPCTVQ GLSRCSGTAC GTNDRYGTVC DPDGCDFNSY RMGDKTYYGP GGTGVDTRSK FTVVTQFLTN NNSSSGTLSE IRRLYVQNGR VVQNSKVNIP GMSNTLDSIT TGFCDSQKTA FGDTRSFQNK GGMSAMGQAL GAGMVLVLSV WDDHAANMLW LDSNYPVDAD PSKPGIARGT CSTTSGKPTD VEQSAANSSV TFSNIKFGDI GTTYTGGSVT TTPGNPGTTT STAPGAVQTK WGQCGGQGWT GPTRCESGST CTVVNQWYSQ CI MFRKAALLAF SFLAIAHGQQ VGTNQAENHP SLPSQHCTAS GCTTSSTSVV LDANWRWVHT TTGYTNCYTG QTWDASICPD GVTCAKACAL DGADYSGTYG ITTSGNALTL QFVKGTNVGS RVYLLQDASN YQLFKLINQE FTFDVDMSNL PCGLNGAVYL SQMDQDGGVS RFPTNTAGAK YGTGYCDSQC PRDIKFINGE ANVAGWTGSS SDPNSGTGNY GTCCSEMDIW EANSVAAAYT PHPCSVNQQT RCTGADCGQD ANRYKGVCDP DGCDFNSFRM GDQTFLGKGL TVDTSRKFTI VTQFISDDGT SSGNLAEIRR FYVQDGKVIP NSKVNIAGCD AVNSITDKFC TQQKTAFGDT NRFADQGGLK QMGAALKSGM VLALSLWDDH AANMLWLDSD YPTTADASKP GVARGTCPNT SGVPKDVESQ SGSATVTYSN IKWGDLNSTF SGTASNPTGP SSSPSGPSSS SSSTAGSQPT QPSSGSVAQW GQCGGIGYSG ATGCVSPYTC HVVNPYYSQC Y TETHPRLTWK RCTSGGNCST VNGAVTIDAN WRWTHTVSGS TNCYTGNEWD TSICSDGKSC AQTCCVDGAD YSSTYGITTS GDSLNLKFVT KHQHGTNVGS RVYLMENDTK YQMFELLGNE FTFDVDVSNL GCGLNGALYF VSMDADGGMS KYSGNKAGAK 
YGTGYCDAQC PRDLKFINGE ANIENWTPST NDANAGFGRY GSCCSEMDIW EANNMATAFT PHPCTIIGQS RCEGNSCGGT YSSERYAGVC DPDGCDFNAY RQGDKTFYGK GMTVDTTKKM TVVTQFHKNS AGVLSEIKRF YVQDGKIIAN AESKIPGNPG NSITQEWCDA QKVAFGDIDD FNRKGGMAQM SKALEGPMVL VMSVWDDHYA NMLWLDSTYP IDKAGTPGAE RGACPTTSGV PAEIEAQVPN SNVIFSNIRF GPIGSTVPGL DGSTPSNPTA TVAPPTSTTT SVRSSTTQIS TPTSQPGGCT TQKWGQCGGI GYTGCTNCVA GTTCTELNPW YSQCL MFHKAVLVAF SLVTIVHGQQ AGTQTAENHP QLSSQKCTAG GSCTSASTSV VLDSNWRWVH TTSGYTNCYT GNTWDASICS DPVSCAQNCA LDGADYAGTY GITTSGDALT LKFVTGSNVG SRVYLMEDET NYQMFKLMNQ EFTFDVDVSN LPCGLNGAVY FVQMDQDGGT SKFPNNKAGA KFGTGYCDSQ CPQDIKFING EANIVDWTAS AGDANSGTGS FGTCCQEMDI WEANSISAAY TPHPCTVTEQ TRCSGSDCGQ GSDRFNGICD PDGCDFNSFR MGNTEFYGKG LTVDTSQKFT IVTQFISDDG TADGNLAEIR RFYVQNGKVI PNSVVQITGI DPVNSITEDF CTQQKTVFGD TNNFAAKGGL KQMGEAVKNG MVLALSLWDD YAAQMLWLDS DYPTTADPSQ PGVARGTCPT TSGVPSQVEG QEGSSSVIYS NIKFGDLNST FTGTLTNPSS PAGPPVTSSP SEPSQSTQPS QPAQPTQPAG TAAQWAQCGG MGFTGPTVCA SPFTCHVLNP YYSQCY MFRAAALLAF TCLAMVSGQQ AGTNTAENHP QLQSQQCTTS GGCKPLSTKV VLDSNWRWVH STSGYTNCYT GNEWNTSLCP DGKTCAANCA LDGADYSGTY GITSTGTALT LKFVTGSNVG SRVYLMADDT HYQLLKLLNQ EFTFDVDMSN LPCGLNGALY LSAMDADGGM SKYPGNKAGA KYGTGYCDSQ CPKDIKFING EANVGNWTET GSNTGTGSYG TCCSEMDIWE ANNDAAAFTP HPCTTTGQTR CSGDDCARNT GLCDHGDGCD FNSFRMGDKT FLGKGMTVDT SKPFTDVTQF LTNDNTSTGT LSEIRRIYIQ NGKVIQNSVA NIPGVDPVNS ITDNFCAQQK TAFGDTNWFA QKGGLKQMGE ALGNGMVLAL SIWDDHAANM LWLDSDYPTD KDPSAPGVAR GTCATTSGVP SDVESQVPNS QVVFSNIKFG DIGSTFSGTS SPNPPGGSTT SSPVTTSPTP PPTGPTVPQW GQCGGIGYSG STTCASPYTC HVLNPYYSQC Y MMMKQYLQYL AAALPLVGLA AGQRAGNETP ENHPPLTWQR CTAPGNCQTV NAEVVIDANW RWLHDDNMQN CYDGNQWTNA CSTATDCAEK CMIEGAGDYL GTYGASTSGD ALTLKFVTKH EYGTNVGSRF YLMNGPDKYQ MFNLMGNELA FDVDLSTVEC GINSALYFVA MEEDGGMASY PSNQAGARYG TGYCDAQCAR DLKFVGGKAN IEGWKSSTSD PNAGVGPYGS CCAEIDVWES NAYAFAFTPH ACTTNEYHVC ETTNCGGTYS EDRFAGKCDA NGCDYNPYRM GNPDFYGKGK TLDTSRKFTV VSRFEENKLS QYFIQDGRKI EIPPPTWEGM PNSSEITPEL CSTMFDVFND RNRFEEVGGF EQLNNALRVP MVLVMSIWDD HYANMLWLDS IYPPEKEGQP GAARGDCPTD SGVPAEVEAQ 
FPDAQVVWSN IRFGPIGSTY DF MYRSATFLTF ASLVLGQQVG TYTAERHPSM PIQVCTAPGQ CTRESTEVVL DANWRWTHIT NGYTNCYTGN EWNATACPDG ATCAKNCAVD GADYSGTYGI TTPSSGALRL QFVKKNDNGQ NVGSRVYLMA SSDKYKLFNL LNKEFTFDVD VSKLPCGLNG AVYFSEMLED GGLKSFSGNK AGAKYGTGYC DSQCPQDIKF INGEANVEGW GGADGNSGTG KYGICCAEMD IWEANSDATA YTPHVCSVNE QTRCEGVDCG AGSDRYNSIC DKDGCDFNSY RLGNREFYGP GKTVDTTRPF TIVTQFVTDD GTDSGNLKSI HRYYVQDGNV IPNSVTEVAG VDQTNFISEG FCEQQKSAFG DNNYFGQLGG MRAMGESLKK MVLVLSIWDD HAVNMNWLDS IFPNDADPEQ PGVARGRCDP ADGVPATIEA AHPDAYVIYS NIKFGAINST FTAN MYRTLAFASL SLYGAARAQQ VGTSTAENHP KLTWQTCTGT GGTNCSNKSG SVVLDSNWRW AHNVGGYTNC YTGNSWSTQY CPDGDSCTKN CAIDGADYSG TYGITTSNNA LSLKFVTKGS FSSNIGSRTY LMETDTKYQM FNLINKEFTF DVDVSKLPCG LNGALYFVEM AADGGIGKGN NKAGAKYGTG YCDSQCPHDI KFINGKANVE GWNPSDADPN GGAGKIGACC PEMDIWEANS ISTAYTPHPC RGVGLQECSD AASCGDGSNR YDGQCDKDGC DFNSYRMGVK DFYGPGATLD TTKKMTVITQ FLGSGSSLSE IKRFYVQNGK VYKNSQSAVA GVTGNSITES FCTAQKKAFG DTSSFAALGG LNEMGASLAR GHVLIMSLWG DHAVNMLWLD STYPTDADPS KPGAARGTCP TTSGKPEDVE KNSPDATVVF SNIKFGPIGS TFAQPA MYQKLALISA FLATARAQSA CTLQAETHPP LTWQKCSSGG TCTQQTGSVV IDANWRWTHA TNSSTNCYDG NTWSSTLCPD NETCAKNCCL DGAAYASTYG VTTSADSLSI GFVTQSAQKN VGARLYLMAS DTTYQEFTLL GNEFSFDVDV SQLPCGLNGA LYFVSMDADG GVSKYPTNTA GAKYGTGYCD SQCPRDLKFI NGQANVEGWE PSSNNANTGI GGHGSCCSEM DIWEANSISE ALTPHPCTTV GQEICDGDSC GGTYSGDRYG GTCDPDGCDW NPYRLGNTSF YGPGSSFTLD TTKKLTVVTQ FETSGAINRY YVQNGVTFQQ PNAELGDYSG NSLDDDYCAA EEAEFGGSSF SDKGGLTQFK KATSGGMVLV MSLWDDYYAN MLWLDSTYPT NETSSTPGAV RGSCSTSSGV PAQLESNSPN AKVVYSNIKF GPIGSTGNSS GGNPPGGNPP GTTTTRRPAT STGSSPGPTQ THYGQCGGIG YSGPTVCASG STCQVLNPYY SQCL MVDSFSIYKT ALLLSMLATS NAQQVGTYTA ETHPSLTWQT CSGSGSCTTT SGSVVIDANW RWVHEVGGYT NCYSGNTWDS SICSTDTTCA SECALEGATY ESTYGVTTSG SSLRLNFVTT ASQKNIGSRL YLLADDSTYE TFKLFNREFT FDVDVSNLPC GLNGALYFVS MDADGGVSRF PTNKAGAKYG TGYCDSQCPR DLKFIDGQAN IEGWEPSSTD VNAGTGNHGS CCPEMDIWEA NSISSAFTAH PCDSVQQTMC TGDTCGGTYS DTTDRYSGTC DPDGCDFNPY RFGNTNFYGP GKTVDNSKPF TVVTQFITHD GTDTGTLTEI RRLYVQNGVV IGNGPSTYTA ASGNSITESF CKAEKTLFGD 
TNVFETHGGL SAMGDALGDG MVLVLSLWDD HAADMLWLDS DYPTTSCASS PGVARGTCPT TTGNATYVEA NYPNSYVTYS NIKFGTLNST YSGTSSGGSS SSSTTLTTKA STSTTSSKTT TTTSKTSTTS SSSTNVAQLY GQCGGQGWTG PTTCASGTCTKQNDYYSQCL MYRILKSFIL LSLVNMSLSQ KIGKLTPEVH PPMTFQKCSE GGSCETIQGE VVVDANWRWV HSAQGQNCYT GNTWNPTICP DDETCAENCY LDGANYESVY GVTTSEDSVR LNFVTQSQGK NIGSRLFLMS NESNYQLFHV LGQEFTFDVD VSNLDCGLNG ALYLVSMDSD GGSARFPTNE AGAKYGTGYC DAQCPRDLKF ISGSANVDGW IPSTNNPNTG YGNLGSCCAE MDLWEANNMA TAVTPHPCDT SSQSVCKSDS CGGAASSNRY GGICDPDGCD YNPYRMGNTS FFGPNKMIDT NSVITVVTQF ITDDGSSDGK LTSIKRLYVQ DGNVISQSVS TIDGVEGNEV NEEFCTNQKK VFGDEDSFTK HGGLAKMGEA LKDGMVLVLS LWDDYQANML WLDSSYPTTS SPTDPGVARG SCPTTSGVPS KVEQNYPNAY VVYSNIKVGP IDSTYKK MISRVLAISS LLAAARAQQI GTNTAEVHPA LTSIVIDANW RWLHTTSGYT NCYTGNSWDA TLCPDAVTCA ANCALDGADY SGTYGITTSG NSLKLNFVTK GANTNVGSRT YLMAAGSKTQ YQLLKLLGQE FTFDVDVSNL PCGLNGALYF AEMDADGGVS RFPTNKAGAQ YGTGYCDAQC PQDIKFINGQ ANSVGWTPSS NDVNTGTGQY GSCCSEMDIW EANKISAAYT PHPCSVDGQT RCTGTDCGIG ARYSSLCDAD GCDFNSYRMG DTGFYGAGLT VDTSKVFTVV TQFITNDGTT SGTLSEIRRF YVQNGKVIPN SQSKVTGVSG NSITDSFCAA QKTAFGDTNE FATKGGLATM SKALAKGMVL VMSIWDDHSA NMLWLDAPYP ASKSPSAAGV SRGSCSASSG VPADVEANSP GASVTYSNIK WGPINSTYSA GTGSNTGSGS GSTTTLVSSV PSSTPTSTTG VPKYGQCGGS GYTGPTNCIG STCVSMGQYY SQCQ MYRQVATALS FASLVLGQQV GTLTAETHPS LPIEVCTAPG SCTKEDTTVV LDANWRWTHV TDGYTNCYTG NAWNETACPD GKTCAANCAI DGAEYEKTYG ITTPEEGALR LNFVTESNVG SRVYLMAGED KYRLFNLLNK EFTMDVDVSN LPCGLNGAVY FSEMDEDGGM SRFEGNKAGA KYGTGYCDSQ CPRDIKFING EANSEGWGGE DGNSGTGKYG TCCAEMDIWE ANLDATAYTP HPCKVTEQTR CEDDTECGAG DARYEGLCDR DGCDFNSFRL GNKEFYGPEK TVDTSKPFTL VTQFVTADGT DTGALQSIRR FYVQDGTVIP NSETVVEGVD PTNEITDDFC AQQKTAFGDN NHFKTIGGLP AMGKSLEKMV LVLSIWDDHA VYMNWLDSNY PTDADPTKPG VARGRCDPEA GVPETVEAAH PDAYVIYSNI KIGALNSTFA AA MSSFQVYRAA LLLSILATAN AQQVGTYTTE THPSLTWQTC TSDGSCTTND GEVVIDANWR WVHSTSSATN CYTGNEWDTS ICTDDVTCAA NCALDGATYE ATYGVTTSGS ELRLNFVTQG SSKNIGSRLY LMSDDSNYEL FKLLGQEFTF DVDVSNLPCG LNGALYFVAM DADGGTSEYS GNKAGAKYGT GYCDSQCPRD LKFINGEANC DGWEPSSNNV NTGVGDHGSC 
CAEMDVWEAN SISNAFTAHP CDSVSQTMCD GDSCGGTYSA SGDRYSGTCD PDGCDYNPYR LGNTDFYGPG LTVDTNSPFT VVTQFITDDG TSSGTLTEIK RLYVQNGEVI ANGASTYSSV NGSSITSAFC ESEKTLFGDE NVFDKHGGLE GMGEAMAKGM VLVLSLWDDY AADMLWLDSD YPVNSSASTP GVARGTCSTD SGVPATVEAE SPNAYVTYSN IKFGPIGSTY SSGSSSGSGS SSSSSSTTTK ATSTTLKTTS TTSSGSSSTS AAQAYGQCGG QGWTGPTTCV SGYTCTYENA YYSQCL MYRAIATASA LLATARAQQV CTLNTENKPA LTWAKCTSSG CSNVRGSVVV DANWRWAHST SSSTNCYTGN TWDKTLCPDG KTCADKCCLD GADYSGTYGV TSSGNQLNLK FVTVGPYSTN VGSRLYLMED ENNYQMFDLL GNEFTFDVDV NNIGCGLNGA LYFVSMDKDG GKSRFSTNKA GAKYGTGYCD AQCPRDVKFI NGVANSDEWK PSDSDKNAGV GKYGTCCPEM DIWEANKIST AYTPHPCKSL TQQSCEGDAC GGTYSATRYA GTCDPDGCDF NPYRQGNKTF YGPGSGFNVD TTKKVTVVTQ FIKGSDGKLS EIKRLYVQNG KVIGNPQSEI ANNPGSSVTD SFCKAQKVAF NDPDDFNKKG GWSGMSDALA KPMVLVMSLW HDHYANMLWL DSTYPKGSKT PGSARGSCPE DSGDPDTLEK EVPNSGVSFS NIKFGPIGST YTGTGGSNPD PEEPEEPEEP VGTVPQYGQC GGINYSGPTA CVSPYKCNKI NDFYSQCQ EQAGTATAEN HPPLTWQECT APGSCTTQNG AVVLDANWRW VHDVNGYTNC YTGNTWDPTY CPDDETCAQN CALDGADYEG TYGVTSSGSS LKLNFVTGSN VGSRLYLLQD DSTYQIFKLL NREFSFDVDV SNLPCGLNGA LYFVAMDADG GVSKYPNNKA GAKYGTGYCD SQCPRDLKFI DGEANVEGWQ PSSNNANTGI GDHGSCCAEM DVWEANSISN AVTPHPCDTP GQTMCSGDDC GGTYSNDRYA GTCDPDGCDF NPYRMGNTSF YGPGKIIDTT KPFTVVTQFL TDDGTDTGTL SEIKRFYIQN SNVIPQPNSD ISGVTGNSIT TEFCTAQKQA FGDTDDFSQH GGLAKMGAAM QQGMVLVMSL WDDYAAQMLW LDSDYPTDAD PTTPGIARGT CPTDSGVPSD VESQSPNSYV TYSNIKFGPI NSTFTAS MFPTLALVSL SFLAIAYGQQ VGTLTAETHP KLSVSQCTAG GSCTTVQRSV VLDSNWRWLH DVGGSTNCYT GNTWDDSLCP DPTTCAANCA LDGADYSGTY GITTSGNALS LKFVTQGPYS TNIGSRVYLL SEDDSTYEMF NLKNQEFTFD VDMSALPCGL NGALYFVEMD KDGGSGRFPT NKAGSKYGTG YCDTQCPHDI KFINGEANVL DWAGSSNDPN AGTGHYGTCC NEMDIWEANS MGAAVTPHVC TVQGQTRCEG TDCGDGDERY DGICDKDGCD FNSWRMGDQT FLGPGKTVDT SSKFTVVTQF ITADNTTSGD LSEIRRLYVQ NGKVIANSKT QIAGMDAYDS ITDDFCNAQK TTFGDTNTFE QMGGLATMGD AFETGMVLVM SIWDDHEAKM LWLDSDYPTD ADASAPGVSR GPCPTTSGDP TDVESQSPGA TVIFSNIKTG PIGSTFTS MLSASKAAAI LAFCAHTASA WVVGDQQTET HPKLNWQRCT GKGRSSCTNV NGEVVIDANW RWLAHRSGYT NCYTGSEWNQ SACPNNEACT KNCAIEGSDY 
AGTYGITTSG NQMNIKFITK RPYSTNIGAR TYLMKDEQNY EMFQLIGNEF TFDVDLSQRC GMNGALYFVS MPQKGQGAPG AKYGTGYCDA QCARDLKFVR GSANAEGWTK SASDPNSGVG KKGACCAQMD VWEANSAATA LTPHSCQPAG YSVCEDTNCG GTYSEDRYAG TCDANGCDFN PFRVGVKDFY GKGKTVDTTK KMTVVTQFVG SGNQLSEIKR FYVQDGKVIA NPEPTIPGME WCNTQKKVFQ EEAYPFNEFG GMASMSEGMS QGMVLVMSLW DDHYANMLWL DSNWPREADP AKPGVARRDC PTSGGKPSEV EAANPNAQVM FSNIKFGPIG STFAHAA MFRTATLLAF TMAAMVFGQQ VGTNTARSHP ALTSQKCTKS GGCSNLNTKI VLDANWRWLH STSGYTNCYT GNQWDATLCP DGKTCAANCA LDGADYTGTY GITASGSSLK LQFVTGSNVG SRVYLMADDT HYQMFQLLNQ EFTFDVDMSN LPCGLNGALY LSAMDADGGM AKYPTNKAGA KYGTGYCDSQ CPRDIKFING EANVEGWNAT SANAGTGNYG TCCTEMDIWE ANNDAAAYTP HPCTTNAQTR CSGSDCTRDT GLCDADGCDF NSFRMGDQTF LGKGLTVDTS KPFTVVTQFI TNDGTSAGTL TEIRRLYVQN GKVIQNSSVK IPGIDPVNSI TDNFCSQQKT AFGDTNYFAQ HGGLKQVGEA LRTGMVLALS IWDDYAANML WLDSNYPTNK DPSTPGVARG TCATTSGVPA QIEAQSPNAY VVFSNIKFGD LNTTYTGTVS SSSVSSSHSS TSTSSSHSSS STPPTQPTGV TVPQWGQCGG IGYTGSTTCA SPYTCHVLNP YYSQCY MYQRALLFSA LMAGVSAQQV GTQKPETHPP LAWKECTSSG CTSKDGSVVI DANWRWVHSV DGYKNCYTGN EWDSTLCPDD ATCATNCAVD GADYAGTYGA TTEGDSLSIN FVTGSNIGSR FYLMEDENKY QMFKLLNKEF TFDVDVSTLP CGLNGALYFV SMDADGGMSK YETNKAGAKY GTGYCDSQCP RDLKFINGKG NVEGWKPSAN DKNAGVGPHG SCCAEMDIWE ANSISTALTP HPCDTNGQTI CEGDSCGGTY STTRYAGTCD PDGCDFNPFR MGNESFYGPG KMVDTKSKMT VVTQFITSDG TDTGSLKEIK RVYVQNGKVI ANSASDVSGI TGNSITSDFC TAQKKTFGDE DVFNKHGGLS GMGDALGEGM VLVMSLWDDH NSNMLWLDGE KYPTDAAASK AGVSRGTCST DSGKPSTVES ESGSAKVVFS NIKVGSIGST FSA MTSKIALASL FAAAYGQQIG TYTTETHPSL TWQSCTAKGS CTTQSGSIVL DGNWRWTHST TSSTNCYTGN TWDATLCPDD ATCAQNCALD GADYSGTYGI TTSGDSLRLN FVTQTANKNV GSRVYLLADN THYKTFNLLN QEFTFDVDVS NLPCGLNGAV YFANLPADGG ISSTNKAGAQ YGTGYCDSQC PRDGKFINGK ANVDGWVPSS NNPNTGVGNY GSCCAEMDIW EANSISTAVT PHSCDTVTQT VCTGDNCGGT YSTTRYAGTC DPDGCDFNPY RQGNESFYGP GKTVDTNSVF TIVTQFLTTD GTSSGTLNEI KRFYVQNGKV IPNSESTISG VTGNSITTPF CTAQKTAFGD PTSFSDHGGL ASMSAAFEAG MVLVLSLWDD YYANMLWLDS TYPTTKTGAG GPRGTCSTSS GVPASVEASS PNAYVVYSNI KVGAINSTFG MYTKFAALAA LVATVRGQAA CSLTAETHPS LQWQKCTAPG SCTTVSGQVT 
IDANWRWLHQ TNSSTNCYTG NEWDTSICSS DTDCATKCCL DGADYTGTYG VTASGNSLNL KFVTQGPYSK NIGSRMYLME SESKYQGFTL LGQEFTFDVD VSNLGCGLNG ALYFVSMDLD GGVSKYTTNK AGAKYGTGYC DSQCPRDLKF INGQANIDGW QPSSNDANAG LGNHGSCCSE MDIWEANKVS AAYTPHPCTT IGQTMCTGDD CGGTYSSDRY AGICDPDGCD FNSYRMGDTS FYGPGKTVDT GSKFTVVTQF LTGSDGNLSE IKRFYVQNGK VIPNSESKIA GVSGNSITTD FCTAQKTAFG DTNVFEERGG LAQMGKALAE PMVLVLSVWD DHAVNMLWLD STYPTDSTKP GAARGDCPIT SGVPADVESQ APNSNVIYSN IRFGPINSTY TGTPSGGNPP GGGTTTTTTT TTSKPSGPTT TTNPSGPQQT HWGQCGGQGW TGPTVCQSPY TCKYSNDWYS QCL MYQRALLFSA LLSVSRAQQA GTAQEEVHPS LTWQRCEASG SCTEVAGSVV LDSNWRWTHS VDGYTNCYTG NEWDATLCPD NESCAQNCAV DGADYEATYG ITSNGDSLTL KFVTGSNVGS RVYLMEDDET YQMFDLLNNE FTFDVDVSNF PCGLNGALYF TSMDADGGLS KYEGNTAGAK YGTGYCDSQC PRDIKFINGL GNVEGWEPSD SDANAGVGGM GTCCPEMDIW EANSISTAYT PHPCDSVEQT MCEGDSCGGT YSDDRYGGTC DPDGCDFNSY RMGNTRFYGP GAIIDTSSKF TVVTQFIADG GSLSEIKRFY VQNGEVIPNS ESNISGVEGN SITSEFCTAQ KTAFGDEDIF AQHGGLSAMG DAASAMVLIL SIWDDHHSSM MWLDSSYPTD ADPSQPGVAR GTCEQGAGDP DVVESEHADA SVTFSNIKFG PIGSTF MMMKQYLQYL AAGSLMTGLV AGQGVGTQQT ETHPRITWKR CTGKANCTTV QAEVVIDSNW RWIHTSGGTN CYDGNAWNTA ACSTATDCAS KCLMEGAGNY QQTYGASTSG DSLTLKFVTK HEYGTNVGSR FYLMNGASKY QMFTLMNNEF TFDVDLSTVE CGLNSALYFV AMEEDGGMRS YPTNKAGAKY GTGYCDAQCA RDLKFVGGKA NIEGWRESSN DENAGVGPYG GCCAEIDVWE SNAHAYAFTP HACENNNYHV CERDTCGGTY SEDRFAGGCD ANGCDYNPYR MGNPDFYGKG KTVDTTKKFT VVTRFQDDNL EQFFVQNGQK ILAPAPTFDG IPASPNLTPE FCSTQFDVFT DRNRFREVGD FPQLNAALRI PMVLVMSIWA DHYANMLWLD SVYPPEKEGE PGAARGPCAQ DSGVPSEVKA NYPNAKVVWS NIRFGPIGST VNV MYQRALLFSF FLAAARAQQA GTVTAENHPS LTWQQCSSGG SCTTQNGKVV IDANWRWVHT TSGYTNCYTG NTWDTSICPD DVTCAQNCAL DGADYSGTYG VTTSGNALRL NFVTQSSGKN IGSRLYLLQD DTTYQIFKLL GQEFTFDVDV SNLPCGLNGA LYFVAMDADG GLSKYPGNKA GAKYGTGYCD SQCPRDLKFI NGQANVEGWQ PSANDPNAGV GNHGSCCAEM DVWEANSIST AVTPHPCDTP GQTMCQGDDC GGTYSSTRYA GTCDPDGCDF NPYRQGNHSF YGPGKIVDTS SKFTVVTQFI TDDGTPSGTL TEIKRFYVQN GKVIPQSEST ISGVTGNSIT TEYCTAQKAA FGDNTGFFTH GGLQKISQAL AQGMVLVMSL WDDHAANMLW LDSTYPTDAD PDTPGVARGT CPTTSGVPAD VESQNPNSYV 
IYSNIKVGPI NSTFTAN MFAIVLLGLT RSLGTGTNQA ENHPSLSWQN CRSGGSCTQT SGSVVLDSNW RWTHDSSLTN CYDGNEWSSS LCPDPKTCSD NCLIDGADYS GTYGITSSGN SLKLVFVTNG PYSTNIGSRV YLLKDESHYQ IFDLKNKEFT FTVDDSNLDC GLNGALYFVS MDEDGGTSRF SSNKAGAKYG TGYCDAQCPH DIKFINGEAN VENWKPQTND ENAGNGRYGA CCTEMDIWEA NKYATAYTPH ICTVNGEYRC DGSECGDTDS GNRYGGVCDK DGCDFNSYRM GNTSFWGPGL IIDTGKPVTV VTQFVTKDGT DNGQLSEIRR KYVQGGKVIE NTVVNIAGMS SGNSITDDFC NEQKSAFGDT NDFEKKGGLS GLGKAFDYGM VLVLSLWDDH QVNMLWLDSI YPTDQPASQP GVKRGPCATS SGAPSDVESQ HPDSSVTFSD IRFGPIDSTY MHQRALLFSA LVGAVRAQQA GTLTEEVHPP LTWQKCTADG SCTEQSGSVV IDSNWRWLHS TNGSTNCYTG NTWDESLCPD NEACAANCAL DGADYESTYG ITTSGDALTL TFVTGENVGS RVYLMAEDDE SYQTFDLVGN EFTFDVDVSN LPCGLNGALY FTSMDADGGV SKYPANKAGA KYGTGYCDSQ CPRDLKFING MANVEGWTPS DNDKNAGVGG HGSCCPELDI WEANSISSAF TPHPCDDLGQ TMCSGDDCGG TYSETRYAGT CDPDGCDFNA YRMGNTSYYG PDKIVDTNSV MTVVTQFIGD GGSLSEIKRL YVQNGKVIAN AQSNVDGVTG NSITSDFCTA QKTAFGDQDI FSKHGGLSGM GDAMSAMVLI LSIWDDHNSS MMWLDSTYPE DADASEPGVA RGTCEHGVGD PETVESQHPG ATVTFSKIKF GPIGSTYSSN STA MFRAAALLAF TCLAMVSGQQ AGTNTAENHP QLQSQQCTTS GGCKPLSTKV VLDSNWRWVH STSGYTNCYT GNEWDTSLCP DGKTCAANCA LDGADYSGTY GITSTGTALT LKFVTGSNVG SRVYLMADDT HYQLLKLLNQ EFTFDVDMSN LPCGLNGALY LSAMDADGGM SKYPGNKAGA KYGTGYCDSQ CPKDIKFING EANVGNWTET GSNTGTGSYG TCCSEMDIWE ANNDAAAFTP HPCTTTGQTR CSGDDCARNT GLCDGDGCDF NSFRMGDKTF LGKGMTVDTS KPFTVVTQFL TNDNTSTGTL SEIRRIYIQN GKVIQNSVAN IPGVDPVNSI TDNFCAQQKT AFGDTNWFAQ KGGLKQMGEA LGNGMVLALS IWDDHAANML WLDSDYPTDK DPSAPGVARG TCATTSGVPS DVESQVPNSQ VVFSNIKFGD IGSTFSGTSS PNPPGGSTTS SPVTTSPTPP PTGPTVPQWG QCGGIGYSGS TTCASPYTCH VLNPCESILS LQRSSNADQY LQTTRSATKR RLDTALQPRK MRTALALILA LAAFSAVSAQ QAGTITAETH PTLTIQQCTQ SGGCAPLTTK VVLDVNWRWI HSTTGYTNCY SGNTWDAILC PDPVTCAANC ALDGADYTGT FGILPSGTSV TLRPVDGLGL RLFLLADDSH YQMFQLLNKE FTFDVEMPNM RCGSSGAIHL TAMDADGGLA KYPGNQAGAK YGTGFCSAQC PKGVKFINGQ ANVEGWLGTT ATTGTGFFGS CCTDIALWEA NDNSASFAPH PCTTNSQTRC SGSDCTADSG LCDADGCNFN SFRMGNTTFF GAGMSVDTTK LFTVVTQFIT SDNTSMGALV EIHRLYIQNG QVIQNSVVNI PGINPATSIT DDLCAQENAA 
FGGTSSFAQH GGLAQVGEAL RSGMVLALSI VNSAADTLWL DSNYPADADP SAPGVARGTC PQDSASIPEA PTPSVVFSNI KLGDIGTTFG AGSALFSGRS PPGPVPGSAP ASSATATAPP FGSQCGGLGY AGPTGVCPSP YTCQALNIYY SQCI MYQRALLFSF FLAAARAHEA GTVTAENHPS LTWQQCSSGG SCTTQNGKVV IDANWRWVHT TSGYTNCYTG NTWDTSICPD DVTCAQNCAL DGADYSGTYG VTTSGNALRL NFVTQSSGKN IGSRLYLLQD DTTYQIFKLL GQEFTFDVDV SNLPCGLNGA LYFVAMDADG NLSKYPGNKA GAKYGTGYCD SQCPRDLKFI NGQANVEGWQ PSANDPNAGV GNHGSSCAEM DVWEANSIST AVTPHPCDTP GQTMCQGDDC GGTYSSTRYA GTCDTDGCDF NPYQPGNHSF YGPGKIVDTS SKFTVVTQFI TDDGTPSGTL TEIKRFYVQN GKVIPQSEST ISGVTGNSIT TEYCTAQKAA FDNTGFFTHG GLQKISQALA QGMVLVMSLW DDHAANMLWL DSTYPTDADP DTPGVARGTC PTTSGVPADV ESQNPNSYVI YSNIKVGPIN STFTAN MHKRAATLSA LVVAAAGFAR GQGVGTQQTE THPKLTFQKC SAAGSCTTQN GEVVIDANWR WVHDKNGYTN CYTGNEWNTT ICADAASCAS NCVVDGADYQ GTYGASTSGN ALTLKFVTKG SYATNIGSRM YLMASPTKYA MFTLLGHEFA FDVDLSKLPC GLNGAVYFVS MDEDGGTSKY PSNKAGAKYG TGYCDSQCPR DLKFIDGKAN SASWQPSSND QNAGVGGMGS CCAEMDIWEA NSVSAAYTPH PCQNYQQHSC SGDDCGGTYS ATRFAGDCDP DGCDWNAYRM GVHDFYGNGK TVDTGKKFSI VTQFKGSGST LTEIKQFYVQ DGRKIENPNA TWPGLEPFNS ITPDFCKAQK QVFGDPDRFN DMGGFTNMAK ALANPMVLVL SLWDDHYSNM LWLDSTYPTD ADPSAPGKGR GTCDTSSGVP SDVESKNGDA TVIYSNIKFG PLDSTYTAS MRASLLAFSL NSAAGQQAGT LQTKNHPSLT SQKCRQGGCP QVNTTIVLDA NWRWTHSTSG STNCYTGNTW QATLCPDGKT CAANCALDGA DYTGTYGVTT SGNSLTLQFV TQSNVGARLG YLMADDTTYQ MFNLLNQEFW FDVDMSNLPC GLNGALYFSA MARTAAWMPM VVCASTPLIS TRRSTARLLR LPVPPRSRYG RGICDSQCPR DIKFINGEAN VQGWQPSPND TNAGTGNYGA CCNKMDVWEA NSISTAYTPH PCTQRGLVRC SGTACGGGSN RYGSICDHDG LGFQNLFGMG RTRVRARVGR VKQFNRSSRV VEPISWTKQT TLHLGNLPWK SADCNVQNGR VIQNSKVNIP GMPSTMDSVT TEFCNAQKTA FNDTFSFQQK GGMANMSEAL RRGMVLVLSI WDDHAANMLW LDSITSAAAC RSTPSEVHAT PLRESQIRSS HSRQTRYVTF TNIKFGPFNS TGTTYTTGSV PTTSTSTGTT GSSTPPQPTG VTVPQGQCGG IGYTGPTTCA SPTTCHVLNP YYSQCY MKQYLQYLAA ALPLMSLVSA QGVGTSTSET HPKITWKKCS SGGSCSTVNA EVVIDANWRW LHNADSKNCY DGNEWTDACT SSDDCTSKCV LEGAEYGKTY GASTSGDSLS LKFLTKHEYG TNIGSRFYLM NGASKYQMFT LMNNEFAFDV DLSTVECGLN SALYFVAMEE DGGMASYSTN KAGAKYGTGY CDAQCARDLK FVGGKANYDG 
WTPSSNDANA GVGALGGCCA EIDVWESNAH AFAFTPHACE NNNYHVCEDT TCGGTYSEDR FAGDCDANGC DYNPYRVGNT DFYGKGMTVD TSKKFTVVSQ FQENKLTQFF VQNGKKIEIP GPKHEGLPTE SSDITPELCS AMPEVFGDRD RFAEVGGFDA LNKALAVPMV LVMSIWDDHY ANMLWLDSSY PPEKAGTPGG DRGPCAQDSG VPSEVESQYP DATVVWSNIR FGPIGSTVQV MFPKASLIAL SFIAAVYGQQ VGTQMAEVHP KLPSQLCTKS GCTNQNTAVV LDANWRWLHT TSGYTNCYTG NSWDATLCPD ATTCAQNCAV DGADYSGTYG ITTSGNALTL KFKTGTNVGS RVYLMQTDTA YQMFQLLNQE FTFDVDMSNL PCGLNGALYL SQMDQDGGLS KFPTNKAGAK YGTGYCDSQC PHDIKFINGM ANVAGWAGSA SDPNAGSGTL GTCCSEMDIW EANNDAAAFT PHPCSVDGQT QCSGTQCGDD DERYSGLCDK DGCDFNSFRM GDKSFLGKGM TVDTSRKFTV VTQFVTTDGT TNGDLHEIRR LYVQDGKVIQ NSVVSIPGID AVDSITDNFC AQQKSVFGDT NYFATLGGLK KMGAALKSGM VLAMSVWDDH AASMQWLDSN YPADGDATKP GVARGTCSAD SGLPTNVESQ SASASVTFSN IKWGDINTTF TGTGSTSPSS PAGPVSSSTS VASQPTQPAQ GTVAQWGQCG GTGFTGPTVC ASPFTCHVVN PYYSQCY MFRTAALLSF AYLAVVYGQQ AGTSTAETHP PLTWEQCTSG GSCTTQSSSV VLDSNWRWTH VVGGYTNCYT GNEWNTTVCP DGTTCAANCA LDGADYEGTY GISTSGNALT LKFVTASAQT NVGSRVYLMA PGSETEYQMF NPLNQEFTFD VDVSALPCGL NGALYFSEMD ADGGLSEYPT NKAGAKYGTG YCDSQCPRDI KFIEGKANVE GWTPSSTSPN AGTGGTGICC NEMDIWEANS ISEALTPHPC TAQGGTACTG DSCSSPNSTA GICDQAGCDF NSFRMGDTSF YGPGLTVDTT SKITVVTQFI TSDNTTTGDL TAIRRIYVQN GQVIQNSMSN IAGVTPTNEI TTDFCDQQKT AFGDTNTFSE KGGLTGMGAA FSRGMVLVLS IWDDDAAEML WLDSTYPVGK TGPGAARGTC ATTSGQPDQV ETQSPNAQVV FSNIKFGAIG STFSSTGTGT GTGTGTGTGT GTTTSSAPAA TQTKYGQCGG QGWTGATVCA SGSTCTSSGP YYSQCL MFRTAALTAF TFAAVVLGQQ VGTLTTENHP ALSIQQCTAT GCTTQQKSVV LDSNWRWTHS TAGATNCYTG NAWDPALCPD PATCATNCAI DGADYSGTYG ITTSGNALTL RFVTNGQYSQ NIGSRVYLLD DADHYKLFDL KNQEFTFDVD MSGLPCGLNG ALYFSEMAAD GGKAAHAGNN AGAKYGTGYC DAQCPHDIKW INGEANVLDW SASATDDNAG NGRYGACCAE MDIWEANSEA TAYTPHVCRD EGLYRCSGTE CGDGNNRYGG VCDKDGCDFN SYRMGDKNFL GRGKTIDTTK KVTVVTQFIT DNNTPTGNLV EIRRVYVQNG VVYQNSFSTF PSLSQYNSIS DEFCVAQKTL FGDNQYYNTH GGTTKMGDAF DNGMVLIMSL WSDHAAHMLW LDSDYPLDKS PSEPGVSRGA CPTSSGDPDD VVANHPNASV TFSNIKYGPI GSTFGGSTPP VSSGGSSVPP VTSTTSSGTT TPTGPTGTVP KWGQCGGIGY SGPTACVAGS TCTYSNDWYS QCL MYRAIATASA LIAAVRAQQV 
CSLTPETKPA LSWSKCTSSG CSNVQGSVTI DANWRWTHQL SGSTNCYTGN KWDTSICTSG KVCAEKCCID GAEYASTYGI TSSGNQLSLS FVTKGTYGTN IGSRTYLMED ENTYQMFQLL GNEFTFDVDV SNIGCGLNGA LYFVSMDADG GKAKYPGNKA GAKYGTGYCD AQCPRDVKFI NGQANSDGWQ PSKSDVNGGI GNLGTCCPEM DIWEANSIST AHTPHPCTKL TQHSCTGDSC GGTYSEDRYG GTCDADGCDF NAYRQGNKTF YGPGSGFNVD TTKKVTVVTQ FHKGSNGRLS EITRLYVQNG KVIANSESKI AGVPGSSLTP EFCTAQKKVF GDIDDFEKKG AWGGMSDALE APMVLVMSLW HDHHSNMLWL DSTYPTDSTK LGAQRGSCST SSGVPADLEK NVPNSKVAFS NIKFGPIGST YKEGQPEPTN PTNPNPTTPG GTVDQWGQCG GTNYSGPTAC KSPFTCKKIN DFYSQCQ MFRTATLLAF TMAAMVFGQQ VGTNTAENHR TLTSQKCTKS GGCSNLNTKI VLDANWRWLH STSGYTNCYT GNQWDATLCP DGKTCAANCA LDGADYTGTY GITASGSSLK LQFVTGSNVG SRVYLMADDT HYQMFQLLNQ EFTFDVDMSN LPCGLNGALY LSAMDADGGM AKYPTNKAGA KYGTGYCDSQ CPRDIKFING EANVEGWNAT SANAGTGNYG TCCTEMDIWE ANNDAAAYTP HPCTTNAQTR CSGSDCTRDT GLCDADGCDF NSFRMGDQTF LGKGLTVDTS KPFTVVTQFI TNDGTSAGTL TEIRRLYVQN GKVIQNSSVK IPGIDLVNSI TDNFCSQQKT AFGDTNYFAQ HGGLKQVGEA LRTGMVLALS IWDDYAANML WLDSNYPTNK DPSTPGVARG TCATTSGVPA QIEAQSPNAY VVFSNIKFGD LNTTYTGTVS SSSVSSSHSS TSTSSSHSSS STPPTQPTGV TVPQWGQCGG IGYTGSTTCA SPYTCHVLNP YYSQCY MYQTSLLASL SFLLATSQAQ QVGTQTAETH PKLTTQKCTT AGGCTDQSTS IVLDANWRWL HTVDGYTNCY TGQEWDTSIC TDGKTCAEKC ALDGADYEST YGISTSGNAL TMNFVTKSSQ TNIGGRVYLL AADSDDTYEL FKLKNQEFTF DVDVSNLPCG LNGALYFSEM DSDGGLSKYT TNKAGAKYGT GYCDTQCPHD IKFINGEANV QNWTASSTDK NAGTGHYGSC CNEMDIWEAN SQATAFTPHV CEAKVEGQYR CEGTECGDGD NRYGGVCDKD GCDFNSYRMG NETFYGSNGS TIDTTKKFTV VTQFITADNT ATGALTEIRR KYVQNDVVIE NSYADYETLS KFNSITDDFC AAQKTLSGDT NDFKTKGGIA RMGESFERGM VLVMSVWDDH AANALWLDSS YPTDADASKP GVKRGPCSTS SGVPSDVEAN DADSSVIYSN IRYGDIGSTF NKTA MFSKVALTAL CFLAVAQAQQ VGREVAENHP RLPWQRCTRN GGCQTVSNGQ VVLDANWRWL HVTDGYTNCY TGNAWNSSVC SDGATCAQRC ALEGANYQQT YGITTSGDAL TIKFLTRSEQ TNIGARVYLM ENEDRYQMFN LLNKEFTFDV DVSKVPCGIN GALYFIQMDA DGGLSSQPNN RAGAKYGTGY CDSQCPRDIK FINGEANSVG WEPSETDPNA GKGQYGICCA EMDIWEANSI SNAYTPHPCQ TVNDGGYQRC QGRDCNQPRY EGLCDPDGCD YNPFRMGNKD FYGPGKTVDT NRKMTVVTQF ITHDNTDTGT LVDIRRLYVQ DGRVIANPPT NFPGLMPAHD 
SITQEFCDDA KRAFEDNDSF GRNGGLAHMG RSLAKGHVLA LSIWNDHTAH MLWLDSNYPT DADPNKPGIA RGTCPTTGGS PRDTEQNHPD AQVIFSNIKF GDIGSTFSGN MYRKLAVISA FLAAARAQQV CTQQAETHPP LTWQKCTASG CTPQQGSVVL DANWRWTHDT KSTTNCYDGN TWSSTLCPDD ATCAKNCCLD GANYSGTYGV TTSGDALTLQ FVTASNVGSR LYLMANDSTY QEFTLSGNEF SFDVDVSQLP CGLNGALYFV SMDADGGQSK YPGNAAGAKY GTGYCDSQCP RDLKFINGQA NVEGWEPSSN NANTGVGGHG SCCSEMDIWE ANSISEALTP HPCETVGQTM CSGDSCGGTY SNDRYGGTCD PDGCDWNPYR LGNTSFYGPG SSFALDTTKK LTVVTQFATD GSISRYYVQN GVKFQQPNAQ VGSYSGNTIN TDYCAAEQTA FGGTSFTDKG GLAQINKAFQ GGMVLVMSLW DDYAVNMLWL DSTYPTNATA STPGAKRGSC STSSGVPAQV EAQSPNSKVI YSNIRFGPIG STGGNTGSNP PGTSTTRAPP SSTGSSPTAT QTHYGQCGGT GWTGPTRCAS GYTCQVLNPF YSQCL MRASLLAFSL AAAVAGGQQA GTLTAKRHPS LTWQKCTRGG CPTLNTTMVL DANWRWTHAT SGSTKCYTGN KWQATLCPDG KSCAANCALD GADYTGTYGI TGSGWSLTLQ FVTDNVGARA YLMADDTQYQ MLELLNQELW FDVDMSNIPC GLNGALYLSA MDADGGMRKY PTNKAGAKYA TGYCDAQCPR DLKYINGIAN VEGWTPSTND ANGIGDHGSC CSEMDIWEAN KVSTAFTPHP CTTIEQHMCE GDSCGGTYSD DRYGVLCDAD GCDFNSYRMG NTTFYGEGKT VDTSSKFTVV TQFIKDSAGD LAEIKAFYVQ NGKVIENSQS NVDGVSGNSI TQSFCKSQKT AFGDIDDFNK KGGLKQMGKA LAQAMVLVMS IWDDHAANML WLDSTYPVPK VPGAYRGSGP TTSGVPAEVD ANAPNSKVAF SNIKFGHLGI SPFSGGSSGT PPSNPSSSAS PTSSTAKPSS TSTASNPSGT GAAHWAQCGG IGFSGPTTCP EPYTCAKDHD IYSQCV MLASTFSYRM YKTALILAAL LGSGQAQQVG TSQAEVHPSM TWQSCTAGGS CTTNNGKVVI DANWRWVHKV GDYTNCYTGN TWDKTLCPDD ATCASNCALE GANYQSTYGA TTSGDSLRLN FVTTSQQKNI GSRLYMMKDD TTYEMFKLLN QEFTFDVDVS NLPCGLNGAL YFVAMDADGG MSKYPTNKAG AKYGTGYCDS QCPRDLKFIN GQANVEGWQP SSNDANAGTG NHGSCCAEMD IWEANSISTA FTPHPCDTPG QVMCTGDACG GTYSSDRYGG TCDPDGCDFN SFRQGNKTFY GPGMTVDTKS KFTVVTQFIT DDGTASGTLK EIKRFYVQNG KVIPNSESTW SGVGGNSITN DYCTAQKSLF KDQNVFAKHG GMEGMGAALA QGMVLVMSLW DDHAANMLWL DSNYPTTASS STPGVARGTC DISSGVPADV EANHPDASVV YSNIKVGPIG STFNSGGSNP GGGTTTTAKP TTTTTTAGSP GGTGVAQHYG QCGGNGWQGP TTCASPYTCQ KLNDFYSQCL MQIKQYLQYL AAALPLVNMA AAQRAGTQQT ETHPRLSWKR CSSGGNCQTV NAEIVIDANW RWLHDSNYQN CYDGNRWTSA CSSATDCAQK CYLEGANYGS TYGVSTSGDA LTLKFVTKHE YGTNIGSRVY LMNGSDKYQM FTLMNNEFAF 
DVDLSKVECG LNSALYFVAM EEDGGMRSYS SNKAGAKYGT GYCDAQCARD LKFVGGKANI EGWRPSTNDA NAGVGPYGAC CAEIDVWESN AYAFAFTPHG CLNNNYHVCE TSNCGGTYSE DRFGGLCDAN GCDYNPYRMG NKDFYGKGKT VDTSRKFTVV TRFEENKLTQ FFIQDGRKID IPPPTWPGLP NSSAITPELC TNLSKVFDDR DRYEETGGFR TINEALRIPM VLVMSIWDGH YASMLWLDSV YPPEKAGQPG AERGPCAPTS GVPAEVEAQF PNAQVIWSNI RFGPIGSTYQ V MTSRIALVSL FAAVYGQQVG TYQTETHPSL TWQSCTAKGS CTTNTGSIVL DGNWRWTHGV GTSTNCYTGN TWDATLCPDD ATCAQNCALE GADYSGTYGI TTSGNSLRLN FVTQSANKNI GSRVYLMADT THYKTFNLLN QEFTFDVDVS NLPCGLNGAV YFANLPADGG ISSTNTAGAE YGTGYCDSQC PRDMKFIKGQ ANVDGWVPSS NNANTGVGNH GSCCAEMDIW EANSISTAVT PHSCDTVTQT VCTGDDCGGT YSSSRYAGTC DPDGCDFNSY RMGDETFYGP GKTVDTNSVF TVVTQFLTTD GTASGTLNEI KRFYVQDGKV IPNSYSTISG VSGNSITTPF CDAQKTAFGD PTSFSDHGGL ASMSAAFEAG MVLVLSLWDD YYANMLWLDS TYPVGKTSAG GPRGTCDTSS GVPASVEASS PNAYVVYSNI KVGAINSTYG MFVFVLLWLT QSLGTGTNQA ENHPSLSWQN CRSGGSCTQT SGSVVLDSNW RWTHDSSLTN CYDGNEWSSS LCPDPKTCSD NCLIDGADYS GTYGITSSGN SLKLVFVTNG PYSTNIGSRV YLLKDESHYQ IFDLKNKEFT FTVDDSNLDC GLNGALYFVS MDEDGGTSRF SSNKAGAKYG TGYCDAQCPH DIKFINGEAN VENWKPQTND ENAGNGRYGA CCTEMDIWEA NKYATAYTPH ICTVNGEYRC DGSECGDTDS GNRYGGVCDK DGCDFNSYRM GNTSFWGPGL IIDTGKPVTV VTQFVTKDGT DNGQLSEIRR KYVQGGKVIE NTVVNIAGMS SGNSITDDFC NEQKSAFGDT NDFEKKGGLS GLGKAFDYGM VLVLSLWDDH QVNMLWLDSI YPTDQPASQP GVKRGPCATS SGAPSDVESQ HPDSSVTFSD IRFGPIDSTY MFRKAALLAF SFLAIAHGQQ VGTNQAENHP SLPSQKCTAS GCTTSSTSVV LDANWRWVHT TTGYTNCYTG QTWDASICPD GVTCAKACAL DGADYSGTYG ITTSGNALTL QFVKGTNVGS RVYLLQDASN YQMFQLINQE FTFDVDMSNL PCGLNGAVYL SQMDQDGGVS RFPTNTAGAK YGTGYCDSQC PRDIKFINGE ANVEGWTGSS TDSNSGTGNY GTCCSEMDIW EANSVAAAYT PHPCSVNQQT RCTGADCGQG DDRYDGVCDP DGCDFNSFRM GDQTFLGKGL TVDTSRKFTI VTQFISDDGT TSGNLAEIRR FYVQDGNVIP NSKVSIAGID AVNSITDDFC TQQKTAFGDT NRFAAQGGLK QMGAALKSGM VLALSLWDDH AANMLWLDSD YPTTADASNP GVARGTCPTT SGFPRDVESQ SGSATVTYSN IKWGDLNSTF TGTLTTPSGS SSPSSPASTS GSSTSASSSA SVPTQSGTVA QWAQCGGIGY SGATTCVSPY TCHVVNAYYS QCY MYRAIATASA LIAAARAQQV CTLTTETKPA LTWSKCTSSG CTDVKGSVGI DANWRWTHQT SSSTNCYTGN KWDTSVCTSG ETCAQKCCLD 
GADYAGTYGI TSSGNQLSLG FVTKGSFSTN IGSRTYLMEN ENTYQMFQLL GNEFTFDVDV SNIGCGLNGA LYFVSMDADG GKARYPANKA GAKYGTGYCD AQCPRDVKFI NGKANSDGWK PSDSDINAGI GNMGTCCPEM DIWEANSIST AFTPHPCTKL TQHACTGDSC GGTYSNDRYG GTCDADGCDF NSYRQGNKTF YGRGSDFNVD TTKKVTVVTQ FKKGSNGRLS EITRLYVQNG KVIANSESKI PGNSGSSLTA DFCSKQKSVF GDIDDFSKKG GWSGMSDALE SPPMVLVMSL WHDHHSNMLW LDSTYPTDST KLGAQRGSCA TTSGVPSDLE RDVPNSKVSF SNIKFGPIGS TYSSGTTNPP PSSTDTSTTP TNPPTGGTVG QYGQCGGQTY TGPKDCKSPY TCKKINDFYS QCQ MSSFQIYRAA LLLSILATAN AQQVGTYTTE THPSLTWQTC TSDGSCTTND GEVVIDANWR WVHSTSSATN CYTGNEWDTS ICTDDVTCAA NCALDGATYE ATYGVTTSGS ELRLNFVTQG SSKNIGSRLY LMSDDSNYEL FKLLGQEFTF DVDVSNLPCG LNGALYFVAM DADGGTSEYS GNKAGAKYGT GYCDSQCPRD LKFINGEANC DGWEPSSNNV NTGVGDHGSC CAEMDVWEAN SISNAFTAHP CDSVSQTMCD GDSCGGTYSA SGDRYSGTCD PDGCDYNPYR LGNTDFYGPG LTVDTNSPFT VVTQFITDDG TSSGTLTEIK RLYVQNGEVI ANGASTYSSV NGSSITSAFC ESEKTLFGDE NVFDKHGGLE GMGEAMAKGM VLVLSLWDDY AADMLWLDSD YPVNSSASTP GVARGTCSTD SGVPATVEAE SPNAYVTYSN IKFGPIGSTY SSGSSSGSGS SSSSSSTTTK ATSTTLKTTS TTSSGSSSTS AAQAYGQCGG QGWTGPTTCV SGYTCTYENA YYSQCL MHQRALLFSA LLTAVRAQQA GTLTEEVHPS LTWQKCTSEG SCTEQSGSVV IDSNWRWTHS VNDSTNCYTG NTWDATLCPD DETCAANCAL DGADYESTYG VTTDGDSLTL KFVTGSNVGS RLYLMDTSDE GYQTFNLLDA EFTFDVDVSN LPCGLNGALY FTAMDADGGV SKYPANKAGA KYGTGYCDSQ CPRDLKFIDG QANVDGWEPS SNNDNTGIGN HGSCCPEMDI WEANKISTAL TPHPCDSSEQ TMCEGNDCGG TYSDDRYGGT CDPDGCDFNP YRMGNDSFYG PGKTIDTGSK MTVVTQFITD GSGSLSEIKR YYVQNGNVIA NADSNISGVT GNSITTDFCT AQKKAFGDED IFAEHNGLAG ISDAMSSMVL ILSLWDDYYA SMEWLDSDYP ENATATDPGV ARGTCDSESG VPATVEGAHP DSSVTFSNIK FGPINSTFSA SA MYAKFATLAA LVAGAAAQNA CTLTAENHPS LTWSKCTSGG SCTSVQGSIT IDANWRWTHR TDSATNCYEG NKWDTSYCSD GPSCASKCCI DGADYSSTYG ITTSGNSLNL KFVTKGQYST NIGSRTYLME SDTKYQMFQL LGNEFTFDVD VSNLGCGLNG ALYFVSMDAD GGMSKYSGNK AGAKYGTGYC DSQCPRDLKF INGEANVENW QSSTNDANAG TGKYGSCCSE MDVWEANNMA AAFTPHPCXV IGQSRCEGDS CGGTYSTDRY AGICDPDGCD FNSYRQGNKT FYGKGMTVDT TKKITVVTQF LKNSAGELSE IKRFYVQNGK VIPNSESTIP GVEGNSITQD WCDRQKAAFG DVTDXQDKGG MVQMGKALAG PMVLVMSIWD DHAVNMLWLD 
STWPIDGAGK PGAERGACPT TSGVPAEVEA EAPNSNVIFS NIRFGPIGST VSGLPDGGSG NPNPPVSSST PVPSSSTTSS GSSGPTGGTG VAKHYEQCGG IGFTGPTQCE SPYTCTKLND WYSQCL MYAKFATLAA LVAGASAQAV CSLTAETHPS LTWQKCTAPG SCTNVAGSIT IDANWRWTHQ TSSATNCYSG SKWDSSICTT GTDCASKCCI DGAEYSSTYG ITTSGNALNL KFVTKGQYST NIGSRTYLME SDTKYQMFKL LGNEFTFDVD VSNLGCGLNG ALYFVSMDAD GGMSKYSGNK AGAKYGTGYC DAQCPRDLKF INGEANVEGW ESSTNDANAG SGKYGSCCTE MDVWEANNMA TAFTPHPCTT IGQTRCEGDT CGGTYSSDRY AGVCDPDGCD FNSYRQGNKT FYGKGMTVDT TKKITVVTQF LKNSAGELSE IKRFYAQDGK VIPNSESTIA GIPGNSITKA YCDAQKTVFQ NTDDFTAKGG LVQMGKALAG DMVLVMSVWD DHAVNMLWLD STYPTDQVGV AGAERGACPT TSGVPSDVEA NAPNSNVIFS NIRFGPIGST VQGLPSSGGT SSSSSAAPQS TSTKASTTTS AVRTTSTATT KTTSSAPAQG TNTAKHWQQC GGNGWTGPTV CESPYKCTKQ NDWYSQCL MLTLVYFLLS LVVSLEIGTQ QSEDHPKLTW QNGSSSVSGS IVLDSNWRWV HDSGTTNCYD GNLWSKDLCP SSDTCSQKCY IEGADYSGTY GIQSSGSKLT LKFVTKGSYS TNIGSRVYLL KDENTYESFK LKNKEFTFTV DDSKLNCGLN GALYFVAMDA DGGKAKYSSF KPGAKYGMGY CDAQCPHDMK FISGKANVDD WKPQDNDENS GNGKLGTCCS EMDIWEGNMK SQAYTVHACT KSGQYECTGQ QCGDTDSGDR FKGTCDKDGC DYASWRWGDQ SFYGEGKTVD TKQPVTVVTQ FIGDPLTEIR RLYVQGGKTI NNSKTSNLAD TYDSITDKFC DATKEASGDT NDFKAKGAMS GFSTNLNNGQ VLVMSLWDDH TANMLWLDST YPTDSSDSTA QRGPCPTSSG VPKDVESQHG DATVVFSDIK FGAINSTFKY N MLAAALFTFA CSVGVGTKTP ENHPKLNWQN CASKGSCSQV SGEVTMDSNW RWTHDGNGKN CYDGNTWISS LCPDDKTCSD KCVLDGAEYQ ATYGIQSNGT ALTLKFVTHG SYSTNIGSRL YLLKDKSTYY VFKLNNKEFT FSVDVSKLPC GLNGALYFVE MDADGGKAKY AGAKPGAEYG LGYCDAQCPS DLKFINGEAN SEGWKPQSGD KNAGNGKYGS CCSEMDVWES NSQATALTPH VCKTTGQQRC SGKSECGGQD GQDRFAGLCD EDGCDFNNWR MGDKTFFGPG LIVDTKSPFV VVTQFYGSPV TEIRRKYVQN GKVIENSKSN IPGIDATAAI SDHFCEQQKK AFGDTNDFKN KGGFAKLGQV FDRGMVLVLS LWDDHQVAML WLDSTYPTNK DKSQPGVDRG PCPTSSGKPD DVESASADAT VVYGNIKFGA LDSTY MLTLVYFLLS LVVSLEIGTQ QSEDHPKLTW QNGSSSVSGS IVLDSNWRWV HDSGTTNCYD GNLWSKDLCP SSNTCSQKCY IEGADYSGTY GIQSSGSKLT LKFVTKGSYS TNIGSRVYLL KDENTYESFK LKNKEFTFTV DDSKLNCGLN GALYFVAMDA DGGKAKYSSF KPGAKYGMGY CDAQCPHDMK FISGKANVDD WKPQDNDENS GNGKLGTCCS EMDIWEGNMK SQAYTVHACT KSGQYECTGQ QCGDTDSGDR 
FKGTCDKDGC DYASWRWGDQ SFYGEGKTVD TKQPVTVVTQ FIGDPLTEIR RLYVQGGKTI NNSKTSNLAD TYDSITDKFC DATKEASGDT NDFKAKGAMS GFSTNLNNGQ VLVMSLWDDH TANMLWLDST YPTDSTKTGA SRGPCAVSSG VPKDVESQYG DATVIYSDIK FGAINSTFKW N MILALLSLAK SLGIATNQAE THPKLTWTRY QSKGSGQTVN GEIVLDSNWR WTHHSGTNCY DGNTWSTSLC PDPTTCSNNC DLDGADYPGT YGISTSGNSL KLGFVTHGSY STNIGSRVYL LRDSKNYEMF KLKNKEFTFT VDDSKLPCGL NGALYFVAMD EDGGVSKNSI NKAGAQYGTG YCDAQCPHDM KFINGEANVL DWKPQSNDEN SGNGRYGACC TEMDIWEANS MATAYTPHVC TVTGLRRCEG TECGDTDANQ RYNGICDKDG CDFNSYRLGD KTFFGVGKTV DSSKPVTVVT QFVTSNGQDS GTLSEIRRKY VQGGKVIENS KVNIAGITAG NSVTDTFCNE QKKAFGDNND FEKKGGLGAL SKQLDAGMVL VLSLWDDHSV NMLWLDSTYP TNAAAGALGT ERGACATSSG APSDVESQSP DATVTFSDIK FGPIDSTY MLVIALILRG LSVGTGTQQS ETHPSLSWQQ TSKGGSGQSV SGSVVLDSNW RWTHTTDGTT NCYDGNEWSS DLCPDASTCS SNCVLEGADY SGTYGITGSG SSLKLGFVTK GSYSTNIGSR VYLLGDESHY KLFKLENNEF TFTVDDSNLE CGLNGALYFV AMDEDGGASK YSGAKPGAKY GMGYCDAQCP HDMKFINGDA NVEGWKPSDN DENAGTGKWG ACCTEMDIWE ANKYATAYTP HICTKNGEYR CEGTDCGDTK DNNRYGGVCD KDGCDFNSWR MGNQSFWGPG LIIDTGKPVT VVTQFLADGG SLSEIRRKYV QGGKVIENTV TKISGMDEFD SITDEFCNQQ KKAFRDTNDF EKKGGLKGLG TAVDAGVVLV LSLWDDHDVN MLWLDSIYPT DSGSKAGADR GPCATSSGVP KDVESNYASA SVTFSDIKFG PIDSTY MLLALFAFGK SLGIATNQAE NHPKLTWTRY QSKGSGQTVN GEIVLDSNWR WTHHSGTNCY DGNTWSTSLC PDPTTCSNNC DLDGADYPGT YGISSSGNSL KLGFVTHGSY STNIGSRVYL LRDSKNYEMF KLKNKEFTFT VDDSKLPCGL NGALYFVAMD EDGGVSKNSI NKAGAQYGTG YCDAQCPHDM KFINGEANVL DWKPQSNDEN SGNGRYGACC TEMDIWEANS MATAYTPHVC TVTGIRRCEG TECGDTDANQ RYNGICDKDG CDFNSYRLGD KSFFGVGKTV DSSKPVTVVT QFVTSNGQDS GTLSEIRRKY VQGGKVIENS KVNIAGMAAG NSITDTFCNE QKKAFGDNND FEKKGGLGAL SKQLDSGMVL VLSLWDDHSV NMLWLDSTYP TNAAAGALGT ERGACATSSG APSDVESQSP DATVTFSDIK FGPIDSTY MLASVVYLVS LVVSLEIGTQ QSEEHPKLTW QNGSSSVSGS IVLDSNWRWL HDSGTTNCYD GNLWSDDLCP NADTCSSKCY IEGADYSGTY GITSSGSKVT LKFVTKGSYS TNIGSRIYLL KDENTYETFK LKNKEFTFTV DDSKLDCGLN GALYFVAMDA DGGKAKYSSF KPGAKYGMGY CDAQCPHDMK FISGKANVDD WKPQDNDENS GDGKLGTCCS EMDIWEGNAK SQAYTVHACS KSGQYECTGQ QCGDTDSGDR FKGTCDKDGC DYASWRWGDQ SFYGEGKTVD 
TKSPVTVVTQ FIGDPLTEIR RVYVQGGKTI NNSKTSNLAD TYDSITDKFC DATKDATGDT NDFKAKGAMA GFSTNLNTAQ VLVSVHCGMI IQPICCGLIR RIQRIQQKQV QAVDRVLCRR VFQRMLKASM VMLQSRTRTL SLELSTRPLV GISPAGRLFF F MILALLVLGK SLGIATNQAE THPKLTWTRY QSKGSGSTVN GEIVLDSNWR WTHHSGTNCY DGNTWSTSLC PDPTTCSNNC DLDGADYPGT YGISTSGNSL KLGFVTHGSY STNIGSRVYL LKDTKSYEMF KLKNKEFTFT VDDSKLPCGL NGALYFVAMD EDGGVSKNSI NKAGAQYGTG YCDAQCPHDM KFINGEANVL DWKPQSNDEN SGNGRYGACC TEMDIWEANS MATAYTPHVC TVTGLRRCEG TECGDTDNDQ RYNGICDKDG CDFNSYRLGD KSFFGVGKTV DSSKPVTVVT QFVTSNGQDS GTLSEIRRKY VQGGKVIENS KVNVAGITAG NSVTDTFCNE QKKAFGDNND FEKKGGLGAL SKQLDAGMVL VLSLWDDHSV NMLWLDSTYP TNAAAGALGT ERGACATSSG KPSDVESQSP DATVTFSDIK FGPIDSTY MLCIGLISFV YSLGVGTNTA ETHPKLTWKN GGQTVNGEVT VDSNWRWTHT KGSTKNCYDG NLWSKDLCPD AATCGKNCVL EGADYSGTYG VTSSGNALTL KFVTHGSYST NVGSRLYLLK DEKTYQMFNL NGKEFTFTVD VSNLPCGLNG ALYHVNMDED GGTKRYPDNE AGAKYGTGYC DAQCPTDLKF INGIPNSDGW KPQSNDKNSG NGKYGSCCSE MDIWEANSIC SAVTPHVCDN LQQTRCQGTA CGENGGGSRF GSSCDPDGCD FNSWRMGNKT FYGPGLIVDT KSKFTVVTQF VGNPVTEIKR KYVQNGKVIE NSYSNIEGMD KFNSVSDKFC TAQKKAFGDT DSFTKHGGFK QLGSALAKGM VLVLSLWDDH TVNMLWLDSV YPTNSKKAGS DRGPCPTTSG VPADVESKSA DANVIYSDIR FGAIDSTYK MLGALVALAS CIGVGTNTPE KHPDLKWTNG GSSVSGSIVV DSNWRWTHIK GETKNCYDGN LWSDKYCPDA ATCGKNCVLE GADYSGTYGV TTSGDAATLK FVTHGQYSTN VGSRLYLLKD EKTYQMFNLV GKEFTFTVDV SNLPCGLNGA LYFVQMDSDG GMAKYPDNQA GAKYGTGYCD AQCPTDLKFI NGIPNSDGWK PQKNDKNSGN GKYGSCCSEM DIWEANSMAT AYTPHVCDKL EQTRCSGSAC GQNGGGDRFS SSCDPDGCDF NSWRMGNKTF WGPGLIVDTK KPVQVVTQFV GSGGSVTEIK RKYVQGGKVI DNSMTNIAAM SKQYNSVSDE FCQAQKKAFG DNDSFTKHGG FRQLGATLSK GHVLVLSLWD DHDVNMLWLD SVYPTNSNKP GADRGPCKTS SGVPSDVESQ NADSTVKYSD IRFGAIDSTY SK MLAAALFTFA CSVGVGTKTT ETHPKLNWQQ CACKGSCSQV SGEVTMDSNW RWTHDGNGKN CYDGNTWISS LCPDDKTCSD KCVLDGAEYQ ATYGIQSNGT ALTPKFVTHG SYSTNIGSRL YLLKDKSTYY VFQLNNKEFT FSVDVSKLPC GLNGALYFVE MDADGGKSKY AGAKPGAEYG LGYCDAQCPS DLKFINGEAN SEGWKPQSGD KNAGNGKYGS CCSEMDVWES NSMATALTPH VCKTTGQTRC SGKSECGGQD GQDRFAGNCD EDGCDFNNWR MGDKTFFGPG LTVDTKSPFV VVTQFYGSPV TEIRRKYVQN 
GKVIENAKSN IPGIDATNAI SDTFCEQQKK AFGDTNDFKN KGGFTKLGSV FSRGMVLVLS LWDDHQVAML WLDSTYPTNK DKSVPGVDRG PCPTSSGKPD DVESASGDAT VVYGNIKFGA LDSTY MFGFLLSLFA LQFALEIGTQ TSESHPSITW ELNGARQSGQ IVIDSNWRWL HDSGTTNCYD GNTWSSDLCP DPEKCSQNCY LEGADYSGTY GISASGSQLT LGFVTKGSYS TNIGSRVYLL KDENTYPMFK LKNKEFTFTV DVSNLPCGLN GALYFVAMPS DGGKAKYPLA KPGAKYGMGY CDAQCPHDMK FINGEANVLD WKPQSNDENA GTGRYGTCCT EMDIWEANSQ ATAYTVHACS KNARCEGTEC GDDSASQRYN GICDKDGCDF NSWRWGNKTF FGPGLTVDSS KPVTVVTQFI GDPLTEIRRI WVQGGKVIQN SFTNVSGITS VDSITNTFCD ESKVATGDTN DFKAKGGMSG FSKALDTEVV LVLSLWDDHT ANMLWLDSTY PTDSTAIGAS RGPCATSSGD PKDVESASAN ASVKFSDIKF GALDSTY MLASLLPLSN SLGTASNQAE THPKLTWTQY TGKGAGQTVN GEIVLDSNWR WTHKDGTNCY DGNTWSSSLC PDPTTCSNNC NLDGADYPGT YGITTSGNQL KLGFVTHGSY STNIGSRVYL LRDSKNYQMF KLKNKEFTFT VDDSKLPCGL NGAVYFVAMD EDGGTAKHSI NKAGAQYGTG YCDAQCPHDM KFINGEANVL DWKPQSNDEN SGNGRWGARC TEMDIWEANS RATAYTPHIC TKTGLYRCEG TECGDSDTNR YGGVCDKDGC DFNSYRMGDK SFFGQGKTVD SSKPVTVVTQ FITDNNQDSG KLTEIRRKYV QGGKVIDNSK VNIAGITAGN PITDTFCDEA KKAFGDNNDF EKKGGLSALG TQLEAGFVLV LSLWDDHSVN MLWLDSTYPT NASPGALGVE RGDCAITSGV PADVESQSAD ASVTFSDIKF GPIDSTY MLCIGLISFV YSLGVGTNTA ETHPKLTWKN GGQTVNGEVT VDSNWRWTHT KGSTKNCYDG NLWSKDLCPD AATCGKNCVL EGADYSGTYG VTSSGNALTL KFVTHGSYST NVGSRLYLLK DEKTYQMFNL NGKEFTFTVD VSNLPCGLSG ALYHVNMDED GGTKRYPDNE AGAKYGTGYC DAQCPTDLKF INGIPNSDGW KPQSNDKNSG NGKYGSCCSE MDIWEANSIC SAVTPHVCDN LQQTRCQGAA CGENGGGSRF GSSCDPDGCD FNSWGMGNKT FYGPGLIVDT KSKFTVVTQF VGNPVTEIKR KYVQNGKVIE NSYSNIEGMD KFNSVSDKFC TAQKKAFGDT DSFTKHGGFK QLGSALAKGM VLVLSLWDDH TVNMLWLDSV YPTNSKKAGS DRGPCPTTSG VPADVESKSA DANVIYSDIR FGAIDSTYK MILALLVLGK SLGIATNQAE THPKLTWTRY QSKGSGSTVN GEIVLDSNWR WTHHSGTNCY DGNTWSTSLC PDPTTCSNNC DLDGADYPGT YGISTSGNSL KLGFVTHGSY STNIGSRVYL LRDSKNYEMF KLKNKEFTFT VDDSKLPCGL NGALYFVAMD EDGGVSKNSI NKAGAQYGTG YCDAQCPHDM KFINGEANVL DWKPQSNDEN SGNGRYGACC TEMDIWEANS MATAYTPHVC TVTGLRRCEG TECGDTDNDQ RYNGICDKDG CDFNSYRLGD KSFFGVGKTV DSSKPVTVVT QFVTSNGQDS GILSETRRKY VQGGKVIENS KVNVAGITAG NSVTDTFCNE QKKAFGDNND 
FEKKGGLGAL SKQLDAGMVL VLSLWDDHSV NMLWLDSTYP TNAAAGALGT ERGACATSSG KPSDVESQSP DATVTFSDIK FGPIDSTY MIGIVLIQTV FGIGVGTQQS ESHPSLSWQQ CSKGGSCTSV SGSIVLDSNW RWTHIPDGTT NCYDGNEWSS DLCPDPTTCS NNCVLEGADY SGTYGISTSG SSAKLGFVTK GSYSTNIGSR VYLLGDESHY KIFDLKNKEF TFTVDDSNLE CGLNGALYFV AMDEDGGASR FTLAKPGAKY GTGYCDAQCP HDIKFINGEA NVQDWKPSDN DDNAGTGHYG ACCTEMDIWE ANKYATAYTP HICTENGEYR CEGKSCGDSS DDRYGGVCDK DGCDFNSWRL GNQSFWGPGL IIDTGKPVTV VTQFVTKDGT DSGALSEIRR KYVQGGKTIE NTVVKISGID EVDSITDEFC NQQKQAFGDT NDFEKKGGLS GLGKAFDYGV VLVLSLWDDH DVNMLWLDSV YPTNPAGKAG ADRGPCATSS GDPKEVEDKY ASASVTFSDI KFGPIDSTY MLVFGIVSFV YSIGVGTNTA ETHPKLTWKN GGSTTNGEVT VDSNWRWTHT KGSTKNCYDG NLWSKDLCPD AATCGKNCVL EGADYSGTYG VTSSGDALTL KFVTHGSYST NVGSRLYLLK DEKTYQMFNL NGKEFTFTVD VSQLPCGLNG ALYFVCMDQD GGMSRYPDNQ AGAKYGTGYC DAQCPTDLKF INGLPNSDGW KPQSNDKNSG NGKYGSCCSE MDIWEANSLA TAVTPHVCDQ VGQTRCEGRA CGENGGGDRF GSICDPDGCD FNSWRMGNKT FWGPGLIIDT KKPVTVVTQF IGSPVTEIKR EYVQGGKVIE NSYTNIEGMD KFNSISDKFC TAQKKAFGDN DSFTKHGGFS KLGQSFTKGQ VLVLSLWDDH TVNMLWLDSV YPTNSKKLGS DRGPCPTSSG VPADVESKNA DSSVKYSDIR FGSIDSTYK MLSFVFLLGF GVSLEIGTQQ SENHPTLSWQ QCTSSGSCTS QSGSIVLDSN WRWVHDSGTT NCYDGNEWSS DLCPDPETCS KNCYLDGADY SGTYGITSNG SSLKLGFVTE GSYSTNIGSR VYLKKDTNTY QIFKLKNHEF TFTVDVSNLP CGLNGALYFV EMEADGGKGK YPLAKPGAQY GMGYCDAQCP HDMKFINGNA NVLDWKPQET DENSGNGRYG TCCTEMDIWE ANSQATAYTP HICTKDGQYQ CEGTECGDSD ANQRYNGVCD KDGCDFNSYR LGNKTFFGPG LIVDSKKPVT VVTQFITSNG QDSGDLTEIR RIYVQGGKTI QNSFTNIAGL TSVDSITEAF CDESKDLFGD TNDFKAKGGF TAMGKSLDTG VVLVLSLWDD HSVNMLWLDS TYPTDAAAGA LGTQRGPCAT SSGAPSDVES QSPDASVTFS DIKFGPLDST Y MLTLVVYLLS LVVSLEIGTQ QSESHPALTW QREGSSASGS IVLDSNWRWV HDSGTTNCYD GNEWSTDLCP SSDTCTQKCY IEGADYSGTY GITTSGSKLT LKFVTKGSYS TNIGSRVYLL KDENTYETFK LKNKEFTFTV DDSKLDCGLN GALYFVAMDA DGGKQKYSSF KPGAKYGMGY CDAQCPHDMK FISGKANVED WKPQDNDENS GNGKLGTCCS EMDIWEGNAK SQAYTVHACT KSGQYECTGT DCGDSDSRYQ GTCDKDGCDY ASYRWGDHSF YGEGKTVDTK QPITVVTQFI GDPLTEIRRL YIQGGKVINN SKTQNLASVY DSITDAFCDA TKAASGDTND FKAKGAMAGF SKNLDTPQVL VLSLWDDHTA NMLWLDSTYP 
TDSRDATAER GPCATSSGVP KDVESNQADA SVVFSDIKFG AINSTYSYN MFGFLLSLFA LQFALEIGTQ TSESHPSITW ELNGARQSGQ IVIDSNWRWL HDSGTTNCYD GNTWSSDLCP DPEKCSQNCY LEGADYSGTY GISASGSQLT LGFVTKGSYS TNIGSRVYLL KDENTYQMFK LKNKEFTFTV DVSNLPCGLN GALYFVAMPS DGGKAKYPLA KPGAKYGMGY CDAQCPHDMK FINGEANVLD WKPQSNDENA GTGRYGTCCT EMDIWEANSQ ATAYTVHACS KNARCEGTEC GDDSASQRYN GICDKDGCDF NSWRWGNKTF FGPGLTVDSS KPVTVVTQFI GDPLTEIRRI WVQGGKVIQN SFTNVSGITS VDSITNTFCD ESKVATGDTN DFKAKGGMSG FSKALDTEVV LVLSLWDDHT ANMLWLDSTY PSNSTAIGAT RGPCATSSGD PKNVESASAN ASVKFSDIKF GAFDSTY MLALVYFLLS LVVSLEIGTQ QSEDHPKLTW QNGSSSVSGS IVLDSNWRWV HDSGTTNCYD GNLWSTDLCP SSDTCTSKCY IEGADYSGTY GITSSGSKVT LKFVTKGSYS TNIGSRIYLL KDENTYETFK LKNKEFTFTV DDSQLNCGLN GALYFVAMDA DGGKAKYSSF KPGAKYGMGY CDAQCPHDMK FISGKANVDD WKPQDNDENS GNGKLGTCCS EMDIWEGNAK SQAYTVHACT KSGQYECTGQ QCGDTDSGDR FKGTCDKDGC DYASWRWGDQ SFYGEGKTVD TKQPVTVVTQ FIGDPLTEIR RLYVQGGKTI NNSKTSNLAD TYDSITDKFC DATKEASGDT NDFKAKGAMS GFSTNLNTAQ VLVLSLWDDH TANMLWLDST YPTDSTKTGA SRGPCAVTSG VPKDVESQYG SAQVVYSDIK FGAINSTY MLALVYFLLS FVVSLEIGTQ QSEDHPKLTW QNGSSSVSGS IVLDSNWRWV HDSGTTNCYD GNLWSTDLCG SSDTCSSKCY IEGADYSGTY GISASGSKLT LKFVTKGSYS TNIGSRVYLL KDENTYETFK LKGKEFTFTV DDSKLDCGLN GALYFVAMDA DGGKAKYSSF KPGAKYGMGY CDAQCPHDMK FISGKANVDD WKPQDNDENS GNGKLGTCCS EMDIWEGNAK SQAYTVHACT KSGQYECTGQ QCGDTDSGDR FKGTCDKDGC DYASWRWGDQ SFYGEGKTID TKQPVTVVTQ FIGDPLTEIR RVYVQGGKVI NNSKTSNLAN VYDSITDKFC DDTKDATGDT NDFKAKGAMS GFSTNLNTAQ VLVMSLWDDH TANMLWLDST YPTDSTKTGA SRGPCAVLSG VPKNVESQHG DATVIYSDIK FGAINSTFSY N MFLALFVLGK SLGIATNQAE NHPKLTWTRY QSKGSGQTVN GEVVLDSNWR WTHHSGTNCY DGNTWSTSLC PDPQTCSSNC DLDGADYPGT YGISSSGNSL KLGFVTHGSY STNIGSRVYL LRDSKNYEMF KLKNKEFTFT VDDSKLPCGL NGALYFVAME EDGGVAKNSI NKAGAQYGTG YCDAQCPHDM KFINGEANVL DWKPQSNDEN SGNGRYGACC IEMDIWEANS MATAYTPHVC TVTGIHRCEG TECGDTDANQ RYNGICDKDG CDFNSYRMGD KSFFGVGKTV DSSKPVTVVT QFVTSNGQDG GTLSEIKRKY VQGGKVIENS KVNIAGITAV NSITDTFCNE QKKAFGDNND FEKKGGLGAL SKQLDLGMVL VLSLWDDHSV NMLWLDSTYP TDAAAGALGT ERGACATSSG KPSDVESQSP DASVTFSDIK FGPIDSTY 
MLLCLLSIAN SLGVGTNTAE NHPKLSWKNG GSSVSGSVTV DANWRWTHIK GETKNCYDGN LWSDKYCPDA ATCGKNCVIE GADYQGTYGV SSSGDGLTLT FVTHGQYSTN VGSRLYLMKD EKTYQMFNLN GKEFTFTVDV SNLPCGLNGA LYFVQMDSDG GMAKYPDNQA GAKYGTGYCD AQCPTDLKFI NGIPNSDGWK PQKNDKNSGN GKYGSCCSEM DIWEANSQAT AYTPHVCDKL EQTRCSGSSC GHTGGGERFS SSCDPDGCDF NSWRMGNKTF WGPGLIVDTK KPVQVVTQFV GSGNSCTEIK RKYVQGGKVI DNSMSNIAGM SKQYNSVSDD FCQAQKKAFG DNDSFTKHGG FRQLGATLGK GHVLVLSLWD DHDVNMLWLD SVYPTNSNKP GSDRGPCKTS SGIPADVESQ AASSSVKYSD IRFGAIDSTY K MLCIGLISFV YSLGVGTNTA ETHPKLTWKN GGQTVNGEVT VDSNWRWTHT KGSTKNCYDG NLWSKDLCPD AATCGKNCVL EGADYSGTYG VTSSGNALTL KFVTHGSYST NVGSRLYLMK DEKTYQMFNL NGKEFTFTVD VSNLPCGLNG ALYHVNMDED GGTKRYPDNE AGAKYGTGYC DAQCPTDLKF INGIPNSDGW KPQSNDKNSG NGKYGSCCSE MDIWEANSIC SAVTPHVCDT LQQTRCQGTA CGENGGGSRF GSSCDPDGCD FNSWRMGNKT FYGPGLIVDT KSKFTVVTQF VGSPVTEIKR KYVQNGKVIE NSFSNIEGMD KFNSISDKFC TAQKKAFGDT DSFTKHGGFK QLGSALAKGM VLVLSLWDDH TVNMLWLDSV YPTNSKKAGS DRGPCPTTSG VPADVESKSA NANVIYSDIR FGAIDSTYK MLLCLLGIAS SLDAGTNTAE NHPQLSWKNG GSSVSGSVTV DANWRWTHIK GETKNCYDGN LWSDKYCPDA ATCGQNCVIE GADYQGTYGV SASGNALTLT FVTHGQYSTN VGSRLYLLKD EKTYQIFNLI GKEFTFTVDV SNLPCGLNGA LYFVQMDADG GTAKYSDNKA GAKYGTGYCD AQCPTDLKFI NGIPNSDGWK PQKNDKNSGN GRYGSCCSEM DVWEANSLAT AYTPHVCDKL EQVRCDGRAC GQNGGGDRFS SSCDPDGCDF NSWRLGNKTF WGPGLIVDTK QPVQVVTQWV GSGTSVTEIK RKYVQGGKVI DNSFTKLDSL TKQYNSVSDE FCVAQKKAFG DNDSFTKHGG FRQLGATLAK GHVLVLSLWD DHDVNMLWLD SVYPTNSNKP GADRGPCKTS SGVPADVESQ AASSSVKYSD IRFGAIDSTY K MLGIGFVCIV YSLGVGTNTA ENHPKLTWKN SGSTTNGEVT VDSNWRWTHT KGTTKNCYDG NLWSKDLCPD AATCGKNCVL EGADYSGTYG VTSSGDALTL KFVTHGSYST NVGSRLYLLK DEKTYQIFNL NGKEFTFTVD VSNLPCGLNG ALYFVNMDAD GGTGRYPDNQ AGAKYGTGYC DAQCPTDLKF INGIPNSDGW KPQSNDKNSG NGKYGSCCSE MDIWEANSLA TAVTPHVCDQ VGQTRCEGRA CGENGGGDRF GSSCDPDGCD FNSWRLGNKT FWGPGLIVDT KKPVTVVTQF VGSPVTEIKR KYVQGGKVIE NSYTNIEGLD KFNSISDKFC TAQKKAFGDN DSFIKHGGFR QLGQSFTKGQ VLVLSLWDDH TVNMLWLDSV YPTNSKKPG DRGPCPTSSG VPADVESKNA GSSVKYSDIR FGSIDSTYK MATLVGILVS LFALEVALEI GTQTSESHPS LSWELNGQRQ TGSIVIDSNW 
RWLHDSGTTN CYDGNEWSSD LCPDPEKCSQ NCYLEGADYS GTYGISSSGN SLQLGFVTKG SYSTNIGSRV YLLKDENTYA TFKLKNKEFT FTADVSNLPC GLNGALYFVA MPADGGKSKY PLAKPGAKYG MGYCDAQCPH DMKFINGEAN ILDWKPSSND ENAGAGRYGT CCTEMDIWEA NSQATAYTVH ACSKNARCEG TECGDDDGRY NGICDKDGCD FNSWRWGNKT FFGPNLIVDS SKPVTVVTQF IGDPLTEIRR IYVQGGKVIQ NSFTNISGVA SVDSITDAFC NENKVATGDT NDFKAKGGMS GFSKALDTEV VLVLSLWDDH TANMLWLDST YPTDSSALGA SRGPCAITSG EPKDVESASA NASVKFSDIK FGAIDSTY MLTLVYFLLS LVVSLEIGTQ QSESHPQLSW QNGSSSVSGS IVLDSNWRWV HDSGTTNCYD GNLWSTDLCP SSDTCTSKCY IEGADYSGTY GITSSGSKLT LKFVTKGSYS TNIGSRVYLL KDENTYETFK LKNKEFTFTV DDSKLDCGLN GALYFVAMDA DGGKAKYSSF KPGAKYGMGY CDAQCPHDMK FISGKANVDD WKPQDNDENS GNGKLGTCCS EMDIWEGNAK SQAYTVHACT KSGQYECTGQ QCGDTDSGDR FKGTCDKDGC DYASWRWGDQ SFYGEGKTVD TKQPLTVVTQ FVGDPLTEIR RVYVQGGKTI NNSKTSNLAD TYDSITDKFC DATKEASGDT NDFKAKGAMS GFSTNLNTAQ VLVMSLWDDH TANMLWLDST YPTDSTKTGA SRGPCAVSSG VPKDVESQHG DATVIYSDIK FGAINSTFKW N MLSLVSIFLV GLGFSLGVGT QQSESHPSLS WQNCSAKGSC QSVSGSIVLD SNWRWLHDSG TTNCYDGNEW STDLCPDAST CDKNCYIEGA DYSGTYGITS SGAQLKLGFV TKGSYSTNIG SRVYLLRDES HYQLFKLKNH EFTFTVDDSQ LPCGLNGALY FVEMAEDGGA KPGAQYGMGY CDAQCPHDMK FITGEANVKD WKPQETDENA GNGHYGACCT EMDIWEANSQ ATAYTPHICS KTGIYRCEGT ECGDNDANQR YNGVCDKDGC DFNSYRLGNK TFWGPGLTVD SNKAMIVVTQ FTTSNNQDSG ELSEIRRIYV QGGKTIQNSD TNVQGITTTN KITQAFCDET KVTFGDTNDF KAKGGFSGLS KSLESGAVLV LSLWDDHSVN MLWLDSTYPT DSAGKPGADR GPCAITSGDP KDVESQSPNA SVTFSDIKFG PIDSTY MILALLVLGK SLGIATNQAE THPKLTWTRY QSKGSGSTVN GEIVLDSNWR WTHHSGTNCY DGNTWSTSLC PDPTTCSNNC DLDGADYPGT YGISTSGNSL KLGFVTHGSY STNIGSRVYL LKDTKSYEMF KLKNKEFTFT VDDSKLPCGL NGALYFVAMD EDGGVSKNSI NKAGAQYGTG YCDAQCPHDM KFINGEANVL DWKPQSNDEN SGNGRYGACC TEMDIWEANS MATAYTPHVC TVTGLRRCEG TECGDTDNDQ RYNGICDKDG CDFNSYRLGD KSFFGVGKTV DSSKPVTVVT QFVTSNGQDS GTLSEIRRKY VQGGKVIENS KVNVAGITAG NSVTDTFCNE QKKAFGDNND FEKKGGFGAL SKQLVAGMVL VLSLWDDHSV NMLWLDSTYP TNAAAGALGT ERGACATSSG KPSDVESQSP DATVTFSDIK FGPIDSTY MLCVGLFGLV YSIGVGTNTQ ETHPKLSWKQ CSSGGSCTTQ QGSVVIDSNW RWTHSTKDLT NCYDGNLWDS TLCPDGTTCS KNCVLEGADY 
SGTYGITSSG DSLTLKFVTH GSYSTNVGSR LYLLKDDNNY QIFNLAGKEF TFTVDVSNLP CGLNGALYFV EMDQDGGKGK HKENEAGAKY GTGYCDAQCP TDLKFIDGIA NSDGWKPQDN DENSGNGKYG SCCSEMDIWE ANSLATAYTP HVCDTKGQKR CQGTACGENG GGDRFGSECD PDGCDFNSWR QGNKSFWGPG LIIDTKKSVQ VVTQFIGSGS SVTEIRRKYV QNGKVIENSY STISGTEKYN SISDDYCNAQ KKAFGDTNSF ENHGGFKRFS QHIQDMVLVL SLWDDHTVNM LWLDSVYPTN SNKPGADRGP CETSSGVPAD VESKSASASV KYSDIRFGPI DSTYK MLLCLWSIAY SLGVGTNTAE NHPKLSWKNG GSSVSGSVTV DANWRWTHIK GETKNCYDGN LWSDKYCPDA ATCGKNCVIE GADYQGTYGV SASGDGLTLT FVTHGQYSTN VGSRLYLMKD EKTYQIFNLN GKEFTFTVDV SNLPCGLNGA LYFVQMDSDG GMAKYPDNQA GAKYGTGYCD AQCPTDLKFI NGIPNSDGWK PQKNDKNSGN GKYGSCCSEM DIWEANSQAT AYTPHVCDKL EQTRCSGSAC GHTGGGERFS SSCDPDGCDF NSWRMGNKTF WGPGLIVDTK KPVQVVTQFV GSGNSCTEIK RKYVQGGKVI DNSMSNIAGM TKQYNSVSDD FCQAQKKAFG DNDSFTKHGG FRQLGATLGK GHVLVLSLWD DHDVNMLWLD SVYPTNSNKP GSDRGPCKTS SGIPADVESQ AASSSVKYSD IRFGAIDSTY K SEQ ID NO: 299 QSACTLQSET HPPLTWQKCS SGGTCTQQTG SVVIDANWRW THATNSSTNC YDGNTWSSTL CPDNETCAKN CCLDGAAYAS TYGVTTSGNS LSIGFVTQSA QKNVGARLYL MASDTTYQEF TLLGNEFSFD VDVSQLPCGL NGALYFVSMD ADGGVSKYPT NTAGAKYGTG YCDSQCPRDL KFINGQANVE GWEPSSNNAN TGIGGHGSCC SEMDIWEANS ISEALTPHPC TTVGQEICEG DGCGGTYSDN AYGGTCDPDG CDWNPYRLGN TSFYGPGSSF TLDTTKKLTV VTQFETSGAI NRYYVQNGVT FQQPNAELGS YSGNELNDDY CTAEEAEFGG SSFSDKGGLT QFKKATSGGM VLVMSLWDDY YANMLWLDST YPTNETSSTP GAVRGSCSTS SGVPAQVESQ SPNAKVTFSN IKFGPIGSTG NPSGGNPPGG NPPGTTTTRR PATTTGSSPG PTQSHYGQCG GIGYSGPTVC ASGTTCQVLN PYYSQCL SEQ ID NO: 300 QSACTLQSET HPPLTWQKCS SGGTCTQQTG SVVIDANWRW THATNSSTNC YDGNTWSSTL CPDNETCAKN CCLDGAAYAS TYGVTTSGNS LSIGFVTQSA QKNVGARLYL MASDTTYQEF TLLGNEFSFD VDVSQLPCGL NGALYFVSMD ADGGVSKYPT NTAGAKYGTG YCDSQCPRDL KFINGQANVE GWEPSSNNAN TGIGGHGSCC SEMDIWEANS ISEALTPHPC TTVGQEICEG DGCGGTYSDN RYGGTCDPDG CDWNPYRLGN TSFYGPGSSF TLDTTKKLTV VTQFETSGAI NRYYVQNGVT FQQPNAELGS YSGNELNDDY CTAEEAEFGG SSFSDKGGLT QFKKATSGGM VLVMSLWDDY YANMLWLDST YPTNETSSTP GAVAGSCSTS SGVPAQVESQ SPNAKVTFSN IKFGPIGSTG NPSGGNPPGG NPPGTTTTRR PATTTGSSPG PTQSHYGQCG GIGYSGPTVC ASGTTCQVLN 
PYYSQCL SEQ ID NO: 301 MSALNSFNMY KSALILGSLL ATAGAQQIGT YTAETHPSLS WSTCKSGGSC TTNSGAITLD ANWRWVHGVN TSTNCYTGNT WNTAICDTDA SCAQDCALDG ADYSGTYGIT TSGNSLRLNF VTGSNVGSRT YLMADNTHYQ IFDLLNQEFT FTVDVSHLPC GLNGALYFVT MDADGGVSKY PNNKAGAQYG VGYCDSQCPR DLKFIAGQAN VEGWTPSSNN ANTGLGNHGA CCAELDIWEA NSISEALTPH PCDTPGLSVC TTDACGGTYS SDKYAGTCDP DGCDFNPYRL GVTDFYGSGK TVDTTKPITV VTQFVTDDGT STGTLSEIRR YYVQNGVVIP QPSSKISGVS GNVINSDFCD AEISTFGETA SFSKHGGLAK MGAGMEAGMV LVMSLWDDYS VNMLWLDSTY PTNATGTPGA AKGSCPTTSG DPKTVESQSG SSYVTFSDIR VGPFNSTFSG GSSTGGSSTT TASGTTTTKA SSTSTSSTST GTGVAAHWGQ CGGQGWTGPT TCASGTTCTV VNPYYSQCL SEQ ID NO: 302 QQIGTYTAET HPSLSWSTCK SGGSCTTNSG AITLDANWRW VHGVNTSTNC YTGNTWNTAI CDTDASCAQD CALDGADYSG TYGITTSGNS LRLNFVTGSN VGSRTYLMAD NTHYQIFDLL NQEFTFTVDV SHLPCGLNGA LYFVTMDADG GVSKYPNNKA GAQYGVGYCD SQCPRDLKFI AGQANVEGWT PSSNNANTGL GNHGACCAEL DIWEANSISE ALTPHPCDTP GLSVCTTDAC GGTYSSDKYA GTCDPDGCDF NPYRLGVTDF YGSGKTVDTT KPITVVTQFV TDDGTSTGTL SEIRRYYVQN GVVIPQPSSK ISGVSGNVIN SDFCDAEIST FGETASFSKH GGLAKMGAGM EAGMVLVMSL WDDYSVNMLW LDSTYPTNAT GTPGAAKGSC PTTSGDPKTV ESQSGSSYVT FSDIRVGPFN STFSGGSSTG GSSTTTASGT TTTKASSTST SSTSTGTGVA AHWGQCGGQG WTGPTTCASG TTCTVVNPYY SQCL 

What is claimed is:
 1. A polypeptide comprising a variant cellobiohydrolase I (“CBH I”) catalytic domain as compared to a reference CBH I catalytic domain, comprising: (a) a substitution at the amino acid position corresponding to R268 of T. reesei CBH I (“R268 substitution”); (b) a substitution at the amino acid position corresponding to R411 of T. reesei CBH I (“R411 substitution”); or (c) both an R268 substitution and an R411 substitution, wherein substitution (a), (b) or (c) decreases product inhibition as compared to the reference CBH I catalytic domain.
 2. A method for producing ethanol, comprising: (a) treating biomass with a composition according to any one of claims 37 to 43 or with a fermentation broth according to claim 1, thereby producing monosaccharides; and (b) culturing a fermenting microorganism in the presence of the monosaccharides produced in step (a) under fermentation conditions, thereby producing ethanol.
 3. The method of claim 2, further comprising, prior to step (a), pretreating the biomass.
 4. The method of claim 2, wherein said fermenting microorganism is a bacterium or a yeast.
 5. The method of claim 4, wherein said fermenting microorganism is a bacterium selected from Zymomonas mobilis, Escherichia coli and Klebsiella oxytoca.
 6. The method of claim 4, wherein said fermenting microorganism is a yeast selected from Saccharomyces cerevisiae, Saccharomyces uvarum, Kluyveromyces fragilis, Kluyveromyces lactis, Candida pseudotropicalis, and Pachysolen tannophilus.
 7. The method of claim 2, wherein said biomass is corn stover, bagasse, sorghum, giant reed, elephant grass, miscanthus, Japanese cedar, wheat straw, switchgrass, hardwood pulp, softwood pulp, crushed sugar cane, energy cane, or Napier grass.
 8. A method for generating a nucleic acid that encodes a product-tolerant variant CBH I polypeptide, comprising modifying the nucleotide sequence of a CBH I-encoding nucleic acid so that the nucleic acid encodes a variant CBH I polypeptide, wherein said variant CBH I polypeptide comprises: (i) an R268 substitution; (ii) an R411 substitution; or (iii) both an R268 substitution and an R411 substitution, thereby generating a nucleic acid that encodes a product-tolerant variant CBH I polypeptide.
 9. The method of claim 8, wherein the modification is by site-directed mutagenesis.
 10. The method of claim 8, wherein the variant CBH I polypeptide comprises an R268 substitution.
 11. The method of claim 10, wherein the R268 substituent is a lysine.
 12. The method of claim 10, wherein the R268 substituent is an alanine.
 13. The method of claim 8, wherein the variant CBH I polypeptide comprises an R411 substitution.
 14. The method of claim 13, wherein the R411 substituent is a lysine.
 15. The method of claim 13, wherein the R411 substituent is an alanine.
 16. A method for producing ethanol, comprising: (a) treating biomass with a fermentation broth according to claim 1, thereby producing monosaccharides; and (b) culturing a fermenting microorganism in the presence of the monosaccharides produced in step (a) under fermentation conditions, thereby producing ethanol. 
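
The nucleotide-level modification recited in claims 8 to 15 (changing a CBH I-encoding nucleic acid so that an arginine codon encodes lysine or alanine instead) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the function names `translate` and `substitute_residue`, the nine-nucleotide toy coding sequence, and the replacement codons AAA (lysine) and GCT (alanine), which are valid standard-code codons but are not taken from this disclosure. Positions in the sketch are counted in the toy sequence itself; locating the residue that corresponds to R268 or R411 of T. reesei CBH I in a real sequence would require a sequence alignment, which is not shown.

```python
from itertools import product

# Standard genetic code, built in the conventional TCAG codon order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Hypothetical codon choices for the two substituents named in the claims.
REPLACEMENT_CODON = {"K": "AAA", "A": "GCT"}


def translate(cds: str) -> str:
    """Translate a coding sequence codon by codon (incomplete tail ignored)."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def substitute_residue(cds: str, position: int, expected: str, new: str) -> str:
    """Replace the codon at 1-based amino acid `position` with a codon for `new`.

    Raises ValueError if the position is out of range or if the codon at that
    position does not encode the `expected` residue.
    """
    start = (position - 1) * 3
    codon = cds[start:start + 3]
    if position < 1 or len(codon) != 3:
        raise ValueError(f"position {position} is outside the coding sequence")
    if CODON_TABLE[codon] != expected:
        raise ValueError(
            f"expected {expected} at position {position}, found {CODON_TABLE[codon]}"
        )
    return cds[:start] + REPLACEMENT_CODON[new] + cds[start + 3:]


# Toy example (not a CBH I sequence): Met-Arg-Gly becomes Met-Lys-Gly.
toy_cds = "ATGCGTGGT"
variant_cds = substitute_residue(toy_cds, 2, "R", "K")
print(variant_cds, translate(variant_cds))  # ATGAAAGGT MKG
```

Checking the expected residue before editing guards against off-by-one numbering, for example when a signal sequence shifts positions relative to the mature polypeptide.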